Although I used to program as a hobby, my eyes bugged out when I saw this article. It's actually quite interesting; I finally realize why the hell people program in lower level languages.
One point that I think could be addressed is backward compatibilty. I really know nothing about this, but don't the versions of the abstractions have to be fairly compatible with each other, especially on a large, distributed system? This extra abstraction of an abstraction has to be orders of magnitude more leaky. The best example I can think of is Windows.
Re:Informative
by
Bastian
·
· Score: 4, Interesting
I think backward(really more slantwise or sideways) compatibility is almost certainly one of the reasons behind why C++ treats string literals as arrays of characters.
I program in C++, but link to C libraries all the time. I also pass string literals into functions that have char* parameters. If C++ didn't treat string literals as char*, that would be impossible.
Re:Informative
by
binaryDigit
·
· Score: 5, Interesting
I think it's a mistake to simply say that "high level languages make for buggier/bloated code". After all, many abstractions are created to solve common problems. If you don't have a string class then you'll either roll your own or have code that is complex and bug prone from calling 6 different functions to append a string. I don't think anyone would agree that it's better to write your own line drawing algorithm and have to program directly to the video card, vs calling one OpenGL method to do the same (well unless you need the absolute last word in performance, but that's another topic).
Exactly. The only way to do something more easily or more efficiently is to restrict your scope. If you know something about a particular operation, or if you can make a few assumptions about it, your life because much easier. Take sorting, for example. Comparison sorts run (at best) in Omega(n log n) time. However, if you know the maximum range of numbers k in a set of length n, and k is much smaller than n, you can use a counting sort and do it in Theta(n) time. But what happens if you put a k+1 number in there? Well, all hell breaks loose.
Another example: Java provides a pretty nifty mail API that you can use to create any kind of E-mail you can dream up in 20 lines of code or so. But you only ever want to send E-mail with a text/plain bodypart and a few attachments. So you make a class that does just that, and save yourself 15 lines of code every time you send mail. But suppose you want to send HTML E-mail, or you want to do something crazy with embedded bodyparts? Well it's not in the scope, so it's back to the old way.
In order to abstract you have to reduce your scope somehow, and you have to ensure that certain parameters are within your scope (which adds overhead). And sometimes there's just nothing you can do about that overhead (like in TCP). And occasionally (if you abstract too much) you limit your scope to the point where your code can't be re-used.
And as you abstract you tend to pile up a list of dependencies. Every library you abstract from needs to be included in addition to your library (assuming you use DLLs). So yes, there are maintenance and versioning headaches involved.
Bottom line: non-trivial abstraction saves time up front, but costs later, mostly in the maintenance phase. There's probably some fixed kharmic limit to how much can be simplified beyond which any effort spent simply in displaces the problem.
Re:Informative
by
__past__
·
· Score: 4, Interesting
If you're curious, yes, there was a B, but there was not actually an A (or rather, there was, but it was called ALGOL).
Between ALGOL and B, there was BCPL (and CPL before that). Hence there was a dispute whether the language following C should be called D or P (and AFAIK, for each name there were several experimental languages that all didn't succeed), until C++ became popular.
Re:Informative
by
GlassHeart
·
· Score: 4, Informative
C++ is not a seperate language from C,
it is merely an incremental improvement
C++ is first of all definitely a separate
language, in the sense that a C++ compiler
will fail to compile legal C code. (Many
compilers accept both C and C++ code, but
must necessarily process them as either C
or C++, not both.) If C and C++ are not
"separate languages", then converting code
from C to C++ or C++ to C must be a trivial
task.
C++ is also a separate language in the sense
that good C++ code (the definition of which
does seem to differ depending on which edition
of Stroustrup you look at) looks little like
good C code. The STL (and templates in
general) and exceptions result in source code
that looks little like C.
it is merely an incremental improvement,
an add-on basically. That's why it's
called C++ and not D.
Stroustrup wrote: "I picked C++ because it
was short, had nice interpretations, and
wasn't of the form ``adjective C.'' in his
own FAQ. No mention of emphasis
on C++ "merely" being an "incremental
improvement".
If you're curious, yes, there was a B,
but there was not actually an A (or rather,
there was, but it was called ALGOL).
Re:Informative
by
GlassHeart
·
· Score: 5, Informative
Every assembly instruction directly maps
to a machine code instruction, so there is
absolutely nothing hidden or being done
behind the scenes.
Nonsense. On the 80x86, for example, a
one-pass assembler cannot know if a forward
JMP (jump) instruction is a "near jump"
(8 bit offset) or a "far jump" (16 bit
offset). It must generate code to assume
the worst, so it tentatively creates a
"far jump" and makes a note of this, because
it doesn't know where it must jump to yet.
In the backpatching phase, it may now know
that the jump was actually "near", so it
changes the instruction to a "near jump",
fills in the 8-bit offset, and overwrites
the spare 8 bits with a NOP (no operation)
instead of shifting every single instruction
below it up by one byte.
A multi-pass assembler can avoid the NOP,
but the fact is still that the same JMP
assembly instruction can map to two
distinct machine language sequences. The
two different kinds of JMP are abstracted
and hidden from the programmer.
Typically, assemblers also provide:
Symbolic constants
Symbolic addresses
Macro definition and expansion
Numeric operators and conversion on
constants
Strings
which are all useful abstractions.
Re:Informative
by
GlassHeart
·
· Score: 4, Informative
Also, could you give an example of some valid C that isn't valid C++? I have yet to encounter any.
The most commonly-encountered difference is
probably:
#include <stdio.h> ... char *p = malloc(100);
which is perfectly valid (and good) C. It is invalid
C++ because the void * return type of malloc()
must be explicitly cast to (char *). However:
would actually be substandard C. Since C
assumes that an unprototyped function returns
int, forgetting to include stdio.h would
generate an error, which is silenced by the
explicit cast.
Furthermore, the most recent iteration of
ANSI C, known as C99, contains many features
not supported in C++.
The underlying problem with programming
by
Jack+Wagner
·
· Score: 5, Insightful
I'm of the idea that the whole premise that high-level tools and high level abstraction coupled with encasulation are the biggest bane of the software industry. We have these high level tools which most programmers really don't understand and are taught that they don't need to understand in order to build these sophisticated products.
Yet, when something goes wrong with the underlying technology they are unable to properly fix their product because all they know is some basic java or VB and they don't understand anything about sockets or big-endian/little endian byte alignment issues. It's no wonder todays software is huge and slow and doesn't work as advertised.
The one shining example of this is FreeBSD, which is based totally on low level C programs and they stress using legacy program methodologies in place of the fancy schmancy new ones which are faulty. The proof is in the pudding, as they say, when you look at the speed and quality if FreeBSD, as opposed to some of the slow ponderous OS's like Windows XP or Mac OSX.
Warmest regards, --Jack
--
Wagner LLC Consulting Co. - Getting it right the first time
Re:The underlying problem with programming
by
binaryDigit
·
· Score: 4, Insightful
Well I'd agree up to a point. The fact is that FreeBSD is trying to solve a different problem/attract a different audience than XP/OSX. If FreeBSD was forced to add all the "features" of the other two in an attempt to compete in that space, then it would suffer mightily. You also have to take into account the level/type of programmers working on the these projects. While FreeBSD might have a core group of seasoned programmers working on it, the other two have a great range of programming experience working on it. A few guys who know what they're doing working on a smaller featureset would always produce better stuff than a large group of loosely coupled and widely differing talents working on a monsterous feature set.
Re:The underlying problem with programming
by
jorleif
·
· Score: 5, Insightful
The real problem is not the existance of high-level abstractions, but the fact that many programmers are unwilling or unable to understand the abstraction.
So you say "let's get rid of encapsulation". But that doesn't solve this problem, because this problem is one of laziness or incompetence rather than not being allowed to touch what's inside the box. Encapsulation solves an entirely different problem, that is the one of modularity. If we abolish encapsulation the same clueless programmers will just produce code that is totally dependent on some obscure property in a specific version of a library. They still won't understand what the library does, so we're in a worse position than when we started.
Re:The underlying problem with programming
by
Yokaze
·
· Score: 5, Insightful
Don't blame the tools.
High level languages and abstractions aren't the problem, neither are pointers in low level languages. It's the people, who can't use them.
Abstraction does mean that you should not have to care about the underlying mechanisms, not that you should not understand them.
-- "Between strong and weak, between rich and poor [...], it is freedom which oppresses and the law which sets free"
Re:The underlying problem with programming
by
radish
·
· Score: 5, Insightful
And inevitably, at some point in a programmer's career, they'll come across a system in which the only available development tool is an assembler
Do you REALLY believe that? Are you mad? I can be pretty sure that in my career I will never be required to develop in assembler. And even if I do, I just have to brush up on my asm - big deal. To be honest, if I was asked to do that I'd probably quit anyway, it's not something I enjoy.
Sure it's important to understand what's going on under the hood, but you have to use the right tools for the right job. No one would cut a lawn with scissors, or someones hair with a mower. Likewise I wouldn't write a FPS game in prolog or a web application in asm.
The real point is that people have to get out of the "one language to code them all" mentality - you need to pick the right language and environment for the task at hand. From a personal point of view that means haveing a solid enough grasp of the fundamentals AT ALL LEVELS (i.e. including high and low level languages) to be able to learn the skills you inevitably won't have when you need them.
Oh, and asm is just an abstraction of machine code. If you're coding in anything except 1's and 0's you're using a high(er) level language. Get over it.
--
----
Den ene knappen er powerknapp, den andre er Bender voice knapp "Bite My Shiny Metal Ass"
Re:The underlying problem with programming
by
Junks+Jerzey
·
· Score: 5, Insightful
After 5 years of programming, my favorite language has become assembler - not because I hate HLL's, but rather, because you get exactly what you code in assembler. There are no "Leaky Abstractions" in assembly.
Ah, but you are wrong, and I'm speaking as someone who has written over 100,000 lines of assembly code. The great majority of the time, when you're faced with a programming problem, you don't want to think about that problem in terms of bits and and bytes and machine instructions and so on. You want to think about the problem in a more abstract way. After all, programming can be extremely difficult, and if you focus on the minute then you may never come up with a solution. And many high level abstractions simply do not exist in assembly language.
What does a closure look like in assembly? It doesn't exist as a concept. Even if you write code using closures in Lisp, compile to assembly language, and then look at the assembly language, the concept of a closure will not exist in the assembly listing. Period. Because it's a higher level concept. It's like talking about a piece of lumber when you're working on a molecular level. There's no such thing when you're viewing things in such a primitive way. "Lumber" only becomes a concept when you have a macroscopic view. Would you want to build a house using individual molecules or would you build a house out of lumber or brick?
Re:The underlying problem with programming
by
Chris+Mattern
·
· Score: 5, Insightful
> There are no "Leaky Abstractions" in assembly.
At this point, may I whisper the word "microcode" in your ear?
Chris Mattern
Re:The underlying problem with programming
by
YU+Nicks+NE+Way
·
· Score: 5, Interesting
And I had my mod points expire this morning...
He's exactly right. No leaky abstractions? I once worked on a project that was delayed six months because a simple, three-line assembler routine that had to return 1 actually returned something else about one time in a thousand. The code was basically "Load x 5 direct; load y addr ind; subt x from y in place", where we could see in the logic analyzer showing the contents in the address which was to be moved into register y was 6. Literally, 999 times in a thousand, that left a 1 in register y. The other time...
We sent the errata off to the manufacturer, who had the good grace to be horrified. It then took six months to figure out how to work around the problem.
And, hey, guess what? Semiconductor holes are a leaky abstraction, too. And don't get me started on subatomic particles.
Re:timeout
by
Minna+Kirai
·
· Score: 4, Informative
"Reliable" means "always works", it doesn't mean "always obeys the spec". (Unless you use a circular definition)
A timeout is a legal result by the TCP specification, but it's not reliable, because your data didn't make it through.
By the IP specification, your data might not make it either- and that's a legal result because the spec allows it to drop packets for any reason at all. That doesn't mean IP is reliable, just that it obeys its own definition.
Of course, no real protocol can ever meet this restrictive definition of reliable. Some maniac can always cut through your wires or incinerate your CPUs. Calling TCP a "reliable protocol" is just a shorthand for "as much more reliable than the underlying protocols as we could manage"
The timeout you mention does make TCP more reliable than IP, because it alerts you to the data loss, where the application can possibly take steps to retransmit it sometime in the future. But its not as if TCP could ever achieve the perfect reliablity that the simplest, most abstract description of it would imply. Which is why, as the author says, those who rely on the abstractions can get bitten later.
Re:even for non-programmers
by
Nexus7
·
· Score: 4, Interesting
I guess a "liberal arts" major would be considered the quintessential "non-programmer". Certainly these people profess a non-concern for most technology, and of course, computing. I don't mean that they wouldn't know about Macs and PCs and Word, but we can agree that is a very superficial view of computing. But appreciating an article such as this "leaky abstractions" required some understanding of the way the networks work, even if there isn't any heavy math in it. In other words, the non-programmer wouldn't understand what the fuss is about.
But that isn't how it's supposed to be. Liberal arts people are supposed to be interested in precisely this kind of thing, because it takes a higher level view of something that is usually presented in a way that only a CS major would find interesting or useful, and generalizes an idea to be applicable beyond the specific subject, networking.
That is, engineers are today's liberal arts majors. It's time to get the so called "liberal arts" people out from politics, humanities, governance, management and other fields of importance because they just aren't trained to have or look for the conceptual basis of decision making and correctly apply it.
The ultimate leaky abstraction
by
nounderscores
·
· Score: 5, Insightful
Is our own bodies.
I'm studying to be a bioinformatics guy with the university of melbourne and have just had the misfortune of looking into the enzymatic reactions that control oxygen based metabolism in the human body.
I tried to do a worst case complexity analysis and gave up about half way through the krebs cycle.
When you think about it, most of basic science, some religeon and all of medicine has been about removing layers of abstraction to try and fix things when they go wrong.
Visual Basic and abstraction breakdown
by
RobertB-DC
·
· Score: 5, Interesting
As a VB programmer, I've *lived* leaky abstractions. Nowhere has it been more obvious than in the gigantic VB app our team is responsible for maintaining. 262.frm files, 36.bas modules, 25.cls classes, and a handful of.ctl's.
Much of our troubles, though, come from a single abstraction leak: the Sheridan (now called Infragistics) Grid control.
Like most VB controls, the Sheridan Grid is designed to be a drop-in, no-code way to display database information. It's designed to be bound to a data control, which itself is a drop-in no-code connection to a database using ODBC (or whatever the flavor of the month happens to be).
The first leak comes in to play because we don't use the data control. We generate SQL on the fly because we need to do things with our queries that go beyond the capabilities of the control, and we don't save to the database until the client clicks "OK". Right away, we've broken the Sheridan Grid's paradigm, and the abstraction started to leak. So we put in buckets -- bucketfuls of code in obscure control events to buffer up changes to be written when the form closes.
Just when things were running smoothly, Sheridan decided to take that kid with his finger in the dike and send him to an orphanage. They "upgraded" the control. The upgrade was designed to make the control more efficient, of course... but we don't use the data control! It completely broke all our code. Every single grid control in the application -- at least one and usually more in each of 200+ forms -- had to have all-new buckets installed to catch the leaks.
You may be wondering by now why we haven't switched to a better grid control. Sure enough, there are controls out there now that would meet 95% of our needs... but 1) that 5% has high client visibility and 2) the rest of the code works, by golly! No way we're going to rip it out unless we're absolutely forced to.
By the way, our application now compiles to a svelte 16.9 MEG...
-- Stressed? Me?
Of course not.
Stress is what a rubber band feels before it breaks, silly.
TCP for the bored
by
mekkab
·
· Score: 5, Insightful
FIne it's relaibale becasue of acks, timeouts, adaptive re-transmit timeouts that take statistical averages of RTT times, exponential back-off and slow start, window acks which keep track of what bytes are received, etc.
So in your case of timing out N, re-tx'ing N, and then getting the repsonse to the first N back after sending the second N, you do two things: 1) Good! You got yr packet! 2) keep track of how many bytes you have received thsu far (TCP is not sending messages, it is sending a stream) 3) when you get the response from your second request, discard it, becuase you already received those bytes from the stream. 4) since you timed out, DON'T use the Round TRip Time for that reponse: slow down your expected RTT time, and THEN start measuring.
And guess what? If I unplug the NIC of the other machine, there is no reliable way of transmitting that data (assuming your destination machine isn't dual homed)- so I keep streaming bytes to a TCP socket and I don't find out my peer is gone for approx. 2 minutes. WOW. There's nothing reliable about that boundary condition!
my point is TCP is reliable ENOUGH. But I wouldn't equate it with a Maytag warranty. It is not a panacea. Infact, for a closed homogenous network I wouldn't even consider it the best option. But if the boundary conditions fall within the acceptible fudge range (remember Real Time human grade systems are not 100% reliable, only 99.99999% and much of that is achieved through redundancy) your leaks are ok.
-- In the future, I would want to not be isolated from my friends in the Space Station.
Time to market is the factor, not elegance
by
Ars-Fartsica
·
· Score: 5, Insightful
This argument is so tired. The downfall of programming is now due to people who can't/don't write C. Twenty years before that the downfall of programming was C programmers who couldn't/wouldn't write assembler.
The market rewards abstractions because they help create high level tools that get products on the market faster. Classic case in point is WordPerfect. They couldn't get their early assembler-based product out on a competitive schedule with Word or other C based programs.
Re:Time to market is the factor, not elegance
by
ChaosDiscord
·
· Score: 5, Insightful
The market rewards...
I'd suggest stearing clear of that phrase if your intention is to indicate that something is "good". It's also completes with things like "The market rewards skilled con men who disappear before you realize you've been rooked" and "The market rewards CEOs who destroy a company's long term future to boost short term stock value so he can cash out and retire."
I'm all in favor of good abstractions, good abstractions will help make us more efficient. But even the best abstractions occasionally fail, and when they fail a programmer needs to be able to look beneath the abstraction. If you're unable to work below and without the abstraction, you'll be forced to call in external help which may cost you any of time, money, showing people you don't entirely trust your proprietary code, and being at the mercy of an external source. Sometimes this trade off is acceptable (I don't really have the foggest idea how my car works, when it breaks I put myself at the mercy of my auto shop). Perhaps we're even moving to a world where you have high level programmers that occasionally call in low level programmers for help. But you can't say that it's always best to live at the highest level of abstraction possible. You need to evaluate the benefits for each case individually.
You point out that many people complain that some new programmers can't program C, while twenty years ago the complaint was the some new programmers can't program assembly. Interestingly both are right. If you're going to be skilled programmer you should have at least a general understanding of how a processor works and assembly. Without this knowledge you're going to be hard pressed to understand certain optimizations and cope with catastrophic failure. If you're going to write in Java or Python, knowing how the layer below (almost always C) works will help you appreciate the benefits of your higher level abstraction. You can't really judge the benefits of one language over another if you don't understand the improvements each tries to make over a lower level language. To be a skilled generalist programmer, you really need at least familiarity with every layer below the one you're using (this is why many Computer Science desgrees include at least one simple assembly class and one introductory electronics class).
This is an overrated rant about bad coding
by
PureFiction
·
· Score: 5, Insightful
Proper abstractions avoid unintended side-effects by presenting a clean view of the intent and function of a given interface, and not just a collection of methods or structures.
When I read what Joel wrote about "leaky abstractions" i saw a peice complaining about "unintended side-effects". I don't think the problem is with abstractions themselves, but rather the implementation.
He lists some examples:
1. TCP - This is a common one. Not only does TCP itself have peculiar behavior in less than ideal conditions, but it is also interfaced with via sockets, which compound the problem with an overly complex API.
If you were to improve on this and present a clean reliable stream transport abstraction is would likely have a simple connection establishment interface and some simple read/write functionality. Errors would be propagated up to a user via exceptions or event handlers. But the point I want to make is that This problem can be solved with a cleaner abstraction.
2. SQL - This example is a straw man. The problem with SQL is not the abstraction it provides, but the complexity of dealing with unknown table sizes when you are trying to write fast generic queries. There is no way to ensure that a query runs fastest on all systems. Every system and environment is going to have different amounts and types of data. The amount of data in a table, the way it is indexed, and the relationship between records is what determines a queries speed. There will always be manual performance tweaking of truly complex SQL simply because every scenario is different and the best solution will vary.
3. C++ string classes. I think this is another straw man. Templates and pointers in C++ are hard. That is all there is too it. Most Visual Basic only coders will not be able to wrap their minds around the logic that is required to write complex c++ template code. No matter how good the abstractions get in C++, you will always have pointers, templates, and complexity. Sorry Joel, your VB coders are going to have to avoid c++ forever. There is simply no way around it. This abstraction was never meant to make things simple enough for Joe Programmer, but rather to provide an extensible, flexible tool for the programmer to use when dealing with string data. Most of the time this is simpler, sometimes it is more complex (try writing your own derived string class - there are a number of required constructors you must implement which are far from obvious) but the end result is that you have a flexible tool, not a leaky abstraction.
There are some other examples, but you see the point. I think Joel has a good idea brewing regarding abstractions, complexity, and managing dependencies and unintended side-effects, but I do not think the problem is anywhere near as clear cut as he presents. As a discipline software engineering has a horrible track record of implementing arcane and overly complex abstractions for network programming (sockets and XTI) generic programming (templates, ref counting, custom allocators) and even operating systems API's (POSIX).
Until we can leave behind all of the cruft and failed experiments of the past, start new with complete and simple abstractions that do not mask behavior, but rather recognize it and provide a mechansim to handle it gracefully, we will run into these problems.
Luckily, such problems are fixable - just write the code. If joel were right and complex abstractions were fundamentally flawed, that would be a dark picture indeed for the future of software engineering (it is only going to grow ever more complex from here kids - make no mistake about it).
Pessimism gone rampant
by
jneemidge
·
· Score: 5, Insightful
This article reminds me of what I hated most about Jurassic Park (the novel -- the movie blessly omits the worst of it) -- Ian Malcolm's runaway pessimism. The arguments boil down to be very similar. Ian Malcolm says that complex systems are so complex we can't ever understand them all, so they're doomed to fail. Joel Spolsky says that our high-level abstractions will fail and because of that we're doomed to need to understand the lower-level stuff. I have problems with both -- they're a sort of technopessimism that I find particularly offensive, because they make the future sound bleak and hopeless despite volumes of evidence that, in fact, we've been dealing successfully with these issues for decades and they're just not all that bad.
We have examples of massively complex systems that work very reliably day-in and day-out. Jet airplanes, for one; the national communications infrastructure, for another. Airplanes are, on the whole, amazingly reliable. The communications infrastructure, on the other hand, suffers numerous small faults, but they're quickly corrected and we go on. Both have some obvious leaky abstractions.
The argument works out to be pessimism, pure and simple -- and unwarrented pessimism to boot. If it were true that things were all that bad, programmers would all _need_ to understand, in gruesome detail, the microarchitectures they're coding to, how instructions are executed, the full intricacies of the compiler, etc. All of these are leaky abstractions from time to time. They'd also need to understand every line of libc, the entire design of X11 top to bottom, and how their disk device driver works. For almost everyone, this simply isn't true. How many web designers, or even communications applications writers, know -- to the specification level -- how TCP/IP works? How many non-commo programmers?
The point is that sometimes you need to know a _little bit_ about the place where the abstraction can leak. You don't need to know the lower layer exhaustively. A truly competant C programmer may need to know a bit about the architecture of their platform (or not -- it's better to write portable code) but they surely do not need to be a competant assembly programmer. A competant web designer may need to know something about HTML, but not the full intricacies of it. And so forth.
Yes, the abstractions leak. Sometimes you get around this by having one person who knows the lower layer inside and out. Sometimes you delve down into the abstraction yourself. And sometimes, you say that, if the form fails because it needs JavaScript and the user turned off JavaScript, it's the user's fault and mandate JavaScript be turned on -- in fact, a _good_ high-level tool would generate defensive code to put a message on the user's screen telling them that, in the absence of JavaScript, things will fail (i.e. the tool itself can save the programmer from the leaky abstraction).
What Ian Malcolm says, when you boil it all down, is that complex systems simply can't work in a sustained fashion. We have numerous examples which disprove the theory. That doesn't mean that we don't need to worry about failure cases, it means we overengineer and build in failsafes and error-correcting logic and so forth. What Joel Spolsky says is that you can't abstract away complexity because the abstractions leak. Again, there are numerous examples where we've done exactly that, and the abstraction has performed perfectly adequately for the vast majority of users. Someone needs to understand the complex part and maintain the abstraction -- the rest of us can get on with what we're doing, which may be just as complex, one layer up. We can, and do, stand on the shoulders of giants all the time -- we don't need to fully understand the giants to make use of their work.
Re:even for non-programmers
by
jaredcoleman
·
· Score: 5, Insightful
Very funny! I agree that the average Joe is still going to be lost with the technical aspects of this article, but the author does generalize...
And you can't drive as fast when it's raining, even though your car has windshield wipers and headlights and a roof and a heater, all of which protect you from caring about the fact that it's raining (they abstract away the weather), but lo, you have to worry about hydroplaning (or aquaplaning in England) and sometimes the rain is so strong you can't see very far ahead so you go slower in the rain, because the weather can never be completely abstracted away, because of the law of leaky abstractions
I've heard a lot of people say that they can't believe how many homes, schools, and other buildings were destroyed by the huge thunderstorms that hit the states this past weekend, or that many people died. Hello, we haven't yet figured out how to control everything! American (middle to upper-class) life is a leaky abstaction. We find this out when we have a hard time coping with natural things that shake up our perfect (abstacted) world. That is what we all need to understand.
Neal Stephenson...
by
mikeee
·
· Score: 4, Interesting
Neal Stephenson talks about something similar in In the Beginning was the Command Line. He calls it interface shear; he's specificially referring the the UI as an abstraction (an interesting idea in itself). His take on it was that abstractions are metaphors, and that "interface shear"/"leaky abstractions" occur in regions where the metaphors break down.
In physics the abstractions leak. Newton's laws leak like crazy. Einstein's theories leak. Presently there are no fundamental theories in physics which don't leak like crazy when quantum mechanics and gravity interact.
In sports the abstractions leak. That's how we get players like Gretzky and pay a lot of money to watch what they do.
And how about the reason why didn't C++ didn't define a native string type. Because there isn't any way to implement a string class that serves all possible applications. The premise of C++ is not being stuck with someone else's choice on what part of the abstraction should leak. Because C++ doesn't define a native string type, the user is free to replace the default standard string implementation with any other string implementation and have it integrate with the language on an equal footing with the standard string type.
If a language is imposes standard abstractions it only takes one abstraction you can't live with to make that choice of language untenable. Which is how C++ has been so successful despite being the worst of all possible languages (except for all the others).
While I usually like Joel's work, I'm pissed about the random jab at C++. For those he didn't read the article, he says something along the lines of
"A lot of the stuff the C++ committe added to the language was to support a string class. Why didn't they just add a built-in string type?"
It's good that a string class wasn't added, because that lead to templates being added! And templates are the greatest thing, ever!
The comment shows a total lack of understanding of post-template, modern C++. People are free not to like C++ (or aspects of it) and to disagree with me about templates, of course, and in that case I'm fine with them taking stabs at it. But I get peeved when people who have just given the language a cursory glance try to fault it. If you haven't used stuff like Loki or Boost, or taken a look at some of the fascinating new design techniques that C++ has enabled, then you're in no place to comment about the language. At least read something like the newer editions of D&E or "The C++ Programming Language" then read "Modern C++" before spouting off.
PS> Of course, I'm not accusing the author of being unknowledgable about C++ or anything of the sort. I'm just saying that this particular comment sounded rather n00b'ish, so to speak.
-- A deep unwavering belief is a sure sign you're missing something...
The Irony of the Shiny New Thing
by
Badgerman
·
· Score: 5, Interesting
Loved this article. Sent it on to my manager and a co-worker.
One thing I liked especially is the danger of the Shiny New Thing. It may be neat and cool and save time, but knowing how to use it does not mean that you can do anything else - or function outside of it.
Right now I'm on an ASP.NET project - and some ASP.NET stuff I actually like. But the IDE actually makes it harder to program responsibly, and even utilize.NET effectively. Unless one understands some of the underpinnings of this NEW technology, you actually can't take advantage of it. Throw in the generated code issues and the IDE, an abstraction of an abstraction, really is disadvantageous.
A friend of mine just about strangled some web developers he worked with as they ONLY use tools (and they love all the Shiny New Ones) and barely know what the tools produce. This has led to hideous issues of having to configure servers and designs to work with their products as opposed to them actually knowing how they work. The guy's a saint, I swear.
I think managers and employers need to be aware of how abstract things can get, and realize good programmers can "drill down" from one layer to another to fix things. A Shiny New Thing made with Shiny New Things does NOT mean the people who did it are talented programmers, or that they can haul your butt out of a jam when the Shiny New Thing looses its shine.
-- "The Sage treasures Unity and measures all things by it" - Lao Tzu
Re:Here goes....
by
Junks+Jerzey
·
· Score: 4, Insightful
Please tell me what you think of this - I would honestly like to know.
I've worked in a way similar to you, and I might still if it were as mindlessly simple to write assembly language programs under Windows as it was back in the day of smaller machines (i.e. no linker, no ugly DLL calling conventions, smaller instruction set, etc.). In addition to being fun, I agree in that assembly language is very useful when you need to develop your own abstractions that are very different from other languages, but it's a fine line. First, you have to really gain something substantial, not just a few microseconds of execution time and an executable that's ten kilobytes smaller. And second, sometimes you *think* you're developing a simpler abstraction, but by the time you're done you really haven't gained anything. It's like the classic newbie mistake of thinking that it's trivial to write a faster memcpy.
These days, I prefer to work the opposite way in these situations. Rather than writing directly in assembly, I try to come up with a workable abstraction. Then I write a simple interpreter for that abstraction in as high a level language as I can (e.g. Lisp, Prolog). Then I work on ways of mechanically optimizing that symbolic representation, and eventually generate code (whether for a virtual machine or an existing assembly language). This is the best of both worlds: You get your own abstraction, you can work with assembly language, but you can mechanically handle the niggling details. If I come up with an optimization, then I can implement it, re-convert my symbolic code, and there it is. This assumes you're comfortable with the kind of programming promoted in books the _Structure and Interpretation of Computer Programs_ (maybe the best programming book ever written). To some extent, this is what you are doing with your macros, but you're working on a much lower level.
Although I used to program as a hobby, my eyes bugged out when I saw this article. It's actually quite interesting; I finally realize why the hell people program in lower level languages.
One point that I think could be addressed is backward compatibilty. I really know nothing about this, but don't the versions of the abstractions have to be fairly compatible with each other, especially on a large, distributed system? This extra abstraction of an abstraction has to be orders of magnitude more leaky. The best example I can think of is Windows.
I'm of the idea that the whole premise that high-level tools and high level abstraction coupled with encasulation are the biggest bane of the software industry. We have these high level tools which most programmers really don't understand and are taught that they don't need to understand in order to build these sophisticated products.
Yet, when something goes wrong with the underlying technology they are unable to properly fix their product because all they know is some basic java or VB and they don't understand anything about sockets or big-endian/little endian byte alignment issues. It's no wonder todays software is huge and slow and doesn't work as advertised.
The one shining example of this is FreeBSD, which is based totally on low level C programs and they stress using legacy program methodologies in place of the fancy schmancy new ones which are faulty. The proof is in the pudding, as they say, when you look at the speed and quality if FreeBSD, as opposed to some of the slow ponderous OS's like Windows XP or Mac OSX.
Warmest regards,
--Jack
Wagner LLC Consulting Co. - Getting it right the first time
"Reliable" means "always works", it doesn't mean "always obeys the spec". (Unless you use a circular definition)
A timeout is a legal result by the TCP specification, but it's not reliable, because your data didn't make it through.
By the IP specification, your data might not make it either- and that's a legal result because the spec allows it to drop packets for any reason at all. That doesn't mean IP is reliable, just that it obeys its own definition.
Of course, no real protocol can ever meet this restrictive definition of reliable. Some maniac can always cut through your wires or incinerate your CPUs. Calling TCP a "reliable protocol" is just a shorthand for "as much more reliable than the underlying protocols as we could manage"
The timeout you mention does make TCP more reliable than IP, because it alerts you to the data loss, where the application can possibly take steps to retransmit it sometime in the future.
But its not as if TCP could ever achieve the perfect reliablity that the simplest, most abstract description of it would imply. Which is why, as the author says, those who rely on the abstractions can get bitten later.
I guess a "liberal arts" major would be considered the quintessential "non-programmer". Certainly these people profess a non-concern for most technology, and of course, computing. I don't mean that they wouldn't know about Macs and PCs and Word, but we can agree that is a very superficial view of computing. But appreciating an article such as this "leaky abstractions" required some understanding of the way the networks work, even if there isn't any heavy math in it. In other words, the non-programmer wouldn't understand what the fuss is about.
But that isn't how it's supposed to be. Liberal arts people are supposed to be interested in precisely this kind of thing, because it takes a higher level view of something that is usually presented in a way that only a CS major would find interesting or useful, and generalizes an idea to be applicable beyond the specific subject, networking.
That is, engineers are today's liberal arts majors. It's time to get the so called "liberal arts" people out from politics, humanities, governance, management and other fields of importance because they just aren't trained to have or look for the conceptual basis of decision making and correctly apply it.
Is our own bodies.
I'm studying to be a bioinformatics guy with the university of melbourne and have just had the misfortune of looking into the enzymatic reactions that control oxygen based metabolism in the human body.
I tried to do a worst case complexity analysis and gave up about half way through the krebs cycle.
When you think about it, most of basic science, some religeon and all of medicine has been about removing layers of abstraction to try and fix things when they go wrong.
As a VB programmer, I've *lived* leaky abstractions. Nowhere has it been more obvious than in the gigantic VB app our team is responsible for maintaining. 262 .frm files, 36 .bas modules, 25 .cls classes, and a handful of .ctl's.
Much of our troubles, though, come from a single abstraction leak: the Sheridan (now called Infragistics) Grid control.
Like most VB controls, the Sheridan Grid is designed to be a drop-in, no-code way to display database information. It's designed to be bound to a data control, which itself is a drop-in no-code connection to a database using ODBC (or whatever the flavor of the month happens to be).
The first leak comes in to play because we don't use the data control. We generate SQL on the fly because we need to do things with our queries that go beyond the capabilities of the control, and we don't save to the database until the client clicks "OK". Right away, we've broken the Sheridan Grid's paradigm, and the abstraction started to leak. So we put in buckets -- bucketfuls of code in obscure control events to buffer up changes to be written when the form closes.
Just when things were running smoothly, Sheridan decided to take that kid with his finger in the dike and send him to an orphanage. They "upgraded" the control. The upgrade was designed to make the control more efficient, of course... but we don't use the data control! It completely broke all our code. Every single grid control in the application -- at least one and usually more in each of 200+ forms -- had to have all-new buckets installed to catch the leaks.
You may be wondering by now why we haven't switched to a better grid control. Sure enough, there are controls out there now that would meet 95% of our needs... but 1) that 5% has high client visibility and 2) the rest of the code works, by golly! No way we're going to rip it out unless we're absolutely forced to.
By the way, our application now compiles to a svelte 16.9 MEG...
Stressed? Me? Of course not. Stress is what a rubber band feels before it breaks, silly.
FIne it's relaibale becasue of acks, timeouts, adaptive re-transmit timeouts that take statistical averages of RTT times, exponential back-off and slow start, window acks which keep track of what bytes are received, etc.
So in your case of timing out N, re-tx'ing N, and then getting the repsonse to the first N back after sending the second N, you do two things:
1) Good! You got yr packet!
2) keep track of how many bytes you have received thsu far (TCP is not sending messages, it is sending a stream)
3) when you get the response from your second request, discard it, becuase you already received those bytes from the stream.
4) since you timed out, DON'T use the Round TRip Time for that reponse: slow down your expected RTT time, and THEN start measuring.
And guess what? If I unplug the NIC of the other machine, there is no reliable way of transmitting that data (assuming your destination machine isn't dual homed)- so I keep streaming bytes to a TCP socket and I don't find out my peer is gone for approx. 2 minutes.
WOW. There's nothing reliable about that boundary condition!
my point is TCP is reliable ENOUGH. But I wouldn't equate it with a Maytag warranty. It is not a panacea. Infact, for a closed homogenous network I wouldn't even consider it the best option. But if the boundary conditions fall within the acceptible fudge range (remember Real Time human grade systems are not 100% reliable, only 99.99999% and much of that is achieved through redundancy) your leaks are ok.
In the future, I would want to not be isolated from my friends in the Space Station.
The market rewards abstractions because they help create high level tools that get products on the market faster. Classic case in point is WordPerfect. They couldn't get their early assembler-based product out on a competitive schedule with Word or other C based programs.
Proper abstractions avoid unintended side-effects by presenting a clean view of the intent and function of a given interface, and not just a collection of methods or structures.
When I read what Joel wrote about "leaky abstractions" i saw a peice complaining about "unintended side-effects". I don't think the problem is with abstractions themselves, but rather the implementation.
He lists some examples:
1. TCP - This is a common one. Not only does TCP itself have peculiar behavior in less than ideal conditions, but it is also interfaced with via sockets, which compound the problem with an overly complex API.
If you were to improve on this and present a clean reliable stream transport abstraction is would likely have a simple connection establishment interface and some simple read/write functionality. Errors would be propagated up to a user via exceptions or event handlers. But the point I want to make is that This problem can be solved with a cleaner abstraction.
2. SQL - This example is a straw man. The problem with SQL is not the abstraction it provides, but the complexity of dealing with unknown table sizes when you are trying to write fast generic queries. There is no way to ensure that a query runs fastest on all systems. Every system and environment is going to have different amounts and types of data. The amount of data in a table, the way it is indexed, and the relationship between records is what determines a queries speed. There will always be manual performance tweaking of truly complex SQL simply because every scenario is different and the best solution will vary.
3. C++ string classes. I think this is another straw man. Templates and pointers in C++ are hard. That is all there is too it. Most Visual Basic only coders will not be able to wrap their minds around the logic that is required to write complex c++ template code. No matter how good the abstractions get in C++, you will always have pointers, templates, and complexity. Sorry Joel, your VB coders are going to have to avoid c++ forever. There is simply no way around it. This abstraction was never meant to make things simple enough for Joe Programmer, but rather to provide an extensible, flexible tool for the programmer to use when dealing with string data. Most of the time this is simpler, sometimes it is more complex (try writing your own derived string class - there are a number of required constructors you must implement which are far from obvious) but the end result is that you have a flexible tool, not a leaky abstraction.
There are some other examples, but you see the point. I think Joel has a good idea brewing regarding abstractions, complexity, and managing dependencies and unintended side-effects, but I do not think the problem is anywhere near as clear cut as he presents. As a discipline software engineering has a horrible track record of implementing arcane and overly complex abstractions for network programming (sockets and XTI) generic programming (templates, ref counting, custom allocators) and even operating systems API's (POSIX).
Until we can leave behind all of the cruft and failed experiments of the past, start new with complete and simple abstractions that do not mask behavior, but rather recognize it and provide a mechansim to handle it gracefully, we will run into these problems.
Luckily, such problems are fixable - just write the code. If joel were right and complex abstractions were fundamentally flawed, that would be a dark picture indeed for the future of software engineering (it is only going to grow ever more complex from here kids - make no mistake about it).
We have examples of massively complex systems that work very reliably day-in and day-out. Jet airplanes, for one; the national communications infrastructure, for another. Airplanes are, on the whole, amazingly reliable. The communications infrastructure, on the other hand, suffers numerous small faults, but they're quickly corrected and we go on. Both have some obvious leaky abstractions.
The argument works out to be pessimism, pure and simple -- and unwarrented pessimism to boot. If it were true that things were all that bad, programmers would all _need_ to understand, in gruesome detail, the microarchitectures they're coding to, how instructions are executed, the full intricacies of the compiler, etc. All of these are leaky abstractions from time to time. They'd also need to understand every line of libc, the entire design of X11 top to bottom, and how their disk device driver works. For almost everyone, this simply isn't true. How many web designers, or even communications applications writers, know -- to the specification level -- how TCP/IP works? How many non-commo programmers?
The point is that sometimes you need to know a _little bit_ about the place where the abstraction can leak. You don't need to know the lower layer exhaustively. A truly competant C programmer may need to know a bit about the architecture of their platform (or not -- it's better to write portable code) but they surely do not need to be a competant assembly programmer. A competant web designer may need to know something about HTML, but not the full intricacies of it. And so forth.
Yes, the abstractions leak. Sometimes you get around this by having one person who knows the lower layer inside and out. Sometimes you delve down into the abstraction yourself. And sometimes, you say that, if the form fails because it needs JavaScript and the user turned off JavaScript, it's the user's fault and mandate JavaScript be turned on -- in fact, a _good_ high-level tool would generate defensive code to put a message on the user's screen telling them that, in the absence of JavaScript, things will fail (i.e. the tool itself can save the programmer from the leaky abstraction).
What Ian Malcolm says, when you boil it all down, is that complex systems simply can't work in a sustained fashion. We have numerous examples which disprove the theory. That doesn't mean that we don't need to worry about failure cases, it means we overengineer and build in failsafes and error-correcting logic and so forth. What Joel Spolsky says is that you can't abstract away complexity because the abstractions leak. Again, there are numerous examples where we've done exactly that, and the abstraction has performed perfectly adequately for the vast majority of users. Someone needs to understand the complex part and maintain the abstraction -- the rest of us can get on with what we're doing, which may be just as complex, one layer up. We can, and do, stand on the shoulders of giants all the time -- we don't need to fully understand the giants to make use of their work.
I've heard a lot of people say that they can't believe how many homes, schools, and other buildings were destroyed by the huge thunderstorms that hit the states this past weekend, or that many people died. Hello, we haven't yet figured out how to control everything! American (middle to upper-class) life is a leaky abstaction. We find this out when we have a hard time coping with natural things that shake up our perfect (abstacted) world. That is what we all need to understand.
Neal Stephenson talks about something similar in In the Beginning was the Command Line. He calls it interface shear; he's specificially referring the the UI as an abstraction (an interesting idea in itself). His take on it was that abstractions are metaphors, and that "interface shear"/"leaky abstractions" occur in regions where the metaphors break down.
Interesting stuff...
In physics the abstractions leak. Newton's laws leak like crazy. Einstein's theories leak. Presently there are no fundamental theories in physics which don't leak like crazy when quantum mechanics and gravity interact.
In sports the abstractions leak. That's how we get players like Gretzky and pay a lot of money to watch what they do.
And how about the reason why didn't C++ didn't define a native string type. Because there isn't any way to implement a string class that serves all possible applications. The premise of C++ is not being stuck with someone else's choice on what part of the abstraction should leak. Because C++ doesn't define a native string type, the user is free to replace the default standard string implementation with any other string implementation and have it integrate with the language on an equal footing with the standard string type.
If a language is imposes standard abstractions it only takes one abstraction you can't live with to make that choice of language untenable. Which is how C++ has been so successful despite being the worst of all possible languages (except for all the others).
While I usually like Joel's work, I'm pissed about the random jab at C++. For those he didn't read the article, he says something along the lines of
"A lot of the stuff the C++ committe added to the language was to support a string class. Why didn't they just add a built-in string type?"
It's good that a string class wasn't added, because that lead to templates being added! And templates are the greatest thing, ever!
The comment shows a total lack of understanding of post-template, modern C++. People are free not to like C++ (or aspects of it) and to disagree with me about templates, of course, and in that case I'm fine with them taking stabs at it. But I get peeved when people who have just given the language a cursory glance try to fault it. If you haven't used stuff like Loki or Boost, or taken a look at some of the fascinating new design techniques that C++ has enabled, then you're in no place to comment about the language. At least read something like the newer editions of D&E or "The C++ Programming Language" then read "Modern C++" before spouting off.
PS> Of course, I'm not accusing the author of being unknowledgable about C++ or anything of the sort. I'm just saying that this particular comment sounded rather n00b'ish, so to speak.
A deep unwavering belief is a sure sign you're missing something...
Loved this article. Sent it on to my manager and a co-worker.
.NET effectively. Unless one understands some of the underpinnings of this NEW technology, you actually can't take advantage of it. Throw in the generated code issues and the IDE, an abstraction of an abstraction, really is disadvantageous.
One thing I liked especially is the danger of the Shiny New Thing. It may be neat and cool and save time, but knowing how to use it does not mean that you can do anything else - or function outside of it.
Right now I'm on an ASP.NET project - and some ASP.NET stuff I actually like. But the IDE actually makes it harder to program responsibly, and even utilize
A friend of mine just about strangled some web developers he worked with as they ONLY use tools (and they love all the Shiny New Ones) and barely know what the tools produce. This has led to hideous issues of having to configure servers and designs to work with their products as opposed to them actually knowing how they work. The guy's a saint, I swear.
I think managers and employers need to be aware of how abstract things can get, and realize good programmers can "drill down" from one layer to another to fix things. A Shiny New Thing made with Shiny New Things does NOT mean the people who did it are talented programmers, or that they can haul your butt out of a jam when the Shiny New Thing looses its shine.
"The Sage treasures Unity and measures all things by it" - Lao Tzu
Please tell me what you think of this - I would honestly like to know.
I've worked in a way similar to you, and I might still if it were as mindlessly simple to write assembly language programs under Windows as it was back in the day of smaller machines (i.e. no linker, no ugly DLL calling conventions, smaller instruction set, etc.). In addition to being fun, I agree in that assembly language is very useful when you need to develop your own abstractions that are very different from other languages, but it's a fine line. First, you have to really gain something substantial, not just a few microseconds of execution time and an executable that's ten kilobytes smaller. And second, sometimes you *think* you're developing a simpler abstraction, but by the time you're done you really haven't gained anything. It's like the classic newbie mistake of thinking that it's trivial to write a faster memcpy.
These days, I prefer to work the opposite way in these situations. Rather than writing directly in assembly, I try to come up with a workable abstraction. Then I write a simple interpreter for that abstraction in as high a level language as I can (e.g. Lisp, Prolog). Then I work on ways of mechanically optimizing that symbolic representation, and eventually generate code (whether for a virtual machine or an existing assembly language). This is the best of both worlds: You get your own abstraction, you can work with assembly language, but you can mechanically handle the niggling details. If I come up with an optimization, then I can implement it, re-convert my symbolic code, and there it is. This assumes you're comfortable with the kind of programming promoted in books the _Structure and Interpretation of Computer Programs_ (maybe the best programming book ever written). To some extent, this is what you are doing with your macros, but you're working on a much lower level.