Code Is Not Literature
An anonymous reader writes "Hacker and author Peter Seibel has done a lot of work to adopt one of the most widely-accepted practices toward becoming a better programmer: reading high quality code. He's set up code-reading groups and interviewed other programmers to see what code they read. But he's come to learn that the overwhelming majority of programmers don't practice what they preach. Why? He says, 'We don't read code, we decode it. We examine it. A piece of code is not literature; it is a specimen.' He relates an anecdote from Donald Knuth about figuring out a Fortran compiler, and indeed, it reads more like a 'scientific investigation' than the process we refer to as 'reading.' Seibel is now changing his code-reading group to account for this: 'So instead of trying to pick out a piece of code and reading it and then discussing it like a bunch of Comp Lit. grad students, I think a better model is for one of us to play the role of a 19th century naturalist returning from a trip to some exotic island to present to the local scientific society a discussion of the crazy beetles they found.'"
...works much better as a model. Performing music is analogous to executing code.
Code is very similar to language. How would it not be?
However, what's being described is entirely different. A narrative relies on both clear expression of the action and a broad context of details to give it resonance.
Code, on the other hand, operates through loops and definitions independent of timeline, so is a better match for architecture and math than the science of communications.
Futurist Traditionalism
Since when was I required to do code reading as an exercise? I've never heard of such a thing.
Can people stop inventing stupid new things I must do to be the perfect programmer?
Do not invite me to your code reading club. I'll decline the invitation.
I should use this sig to advertise my book ISBN-13 : 978-1501515132.
Welcome back. Have you been on vacation or in the slammer?
Discussing the meta-narrative implied by errant GOTO statements? Considering the motivations of while loops? Debating the thematic development in variable naming schemes?
Learning from anything means analyzing it, knowing what goes where for what reason, and then thinking about if there are other ways to do it, better ways to do it, if it needs to be done at all, etc. When you're reading with the intent to understand and you're doing something that's not appreciably different from simply 'viewing words' you're doing something wrong.
There is a (rather small) minority view that code can actually improve our ability to think - http://www.jsoftware.com/jwiki... . However, the bulk of opinion sees code as an obstacle to be overcome - rightly so, given the sloppy, ad-hoc construction of most programming languages.
The code is not a dump truck. The code is a series of tubes...
You do not have a moral or legal right to do absolutely anything you want.
smit is that you?
80% of the code of a program is uninteresting and mundane. As a programmer, you want to get to the core nugget of a problem, suss it out and solve it. "Reading" code is the same thing. You want to decode the structure, find the interesting 20% and move on to gleaning whatever you can from it. It's in our nature.
When writing code, your audience is not the compiler.
Your audience is another human being who will be maintaining that code a few years later.
If your audience were the compiler, then your code would just need to compile and run. It could be ugly. Unreadable. Unmaintainable. Uncommented. Have meaningless identifiers. Poor organization. Follow worst practices. Etc. In short, the kind of code you get from an outsourced contractor.
Consider that another human is your audience. Choose identifiers such that a comment is unnecessary. Comments should not say what is obvious. (This assigns foo to x.) Comments should say what is not obvious and cannot be made obvious by the code itself.
Write your code almost as if you are writing literature.
I'll see your senator, and I'll raise you two judges.
Code is a tool someone has built. It is not an attempt at communication. It is a language in a similar sense that mathematics is a language. It has more in common with a building or a car than a work of literature.
Code itself is simply a set of rules tying words and symbols to operations on a system. Learning those rules won't make you better at anything but learning rules. What will help you develop as a thinker is learning the underlying theory and ideas of a closely related field -- computer science. Thinking up your own solution to the dining philosophers problem, the knapsack problem or even understanding how you can describe the solution to the towers of Hanoi as an iterative process all help you develop problem solving skills and grant deeper insight into solving other problems. Simply learning a new coding language (unless that language is interestingly 'conceptually' (for lack of a better word) different from one you already know, like learning LISP when all you know is BASIC) won't improve much.
welcome to /chan .... *facepalm* ....
the support group for in-bread neo nazi survivors of horse rape is over at stromfront , here is where the adults are talking so stfu and go back to your google image searches for BBC
It is still a method of communication, though. You can often tell a lot about the programmer and his state of mind at the time by reading his code. It's very easy to tell when they were confused about what they were trying to accomplish, how comfortable they were with the language they were using and whether or not they were in a hurry.
Early on in my career, I started with the assumption that the original programmer knew what he was doing. The more code I read, the more I realized that this is almost never the case. From my observations, it takes about a year for someone to come up to speed with a project, the business process for the company they're working at, and any code base that was already there. Longer if the company's business processes suck. Until then they're mildly to severely confused, and this is reflected in their code. Since a lot of programmers don't hang around at one company for much longer than that, most of the code that I've run across has been crap. The first inclination might be to rewrite it, but as you're starting on a new project you're also mildly to severely confused, so it's best just to study the crap closely and make minor improvements as the opportunities arise. A crap in the hand is worth two in the bush. Or something. Most of the time. I've run across a couple of what had to have been bottom-ten-percent programmers whose crap did end up requiring full rewrites. Coming into a C project where the programmer didn't realize strings are null terminated is a huge warning. C++ or Java code where everything inherits from everything else is also a warning.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
Neo Nazis are in bread now? White, wheat or whole-wheat white?
I disagree completely. A program resembles literature on two levels.
First, the code itself uses an extremely rigid grammar to express the requirements of the program. This expression can be simple or complex; clear or muddled. The extent to which the author (in every sense of the term) expresses these clearly and elegantly determines how likely the code is to succeed at its original purpose as well as how easy it will be to maintain.
Secondly, the UI (if present) is also a realization of the ideas behind the program. The clarity with which the ideas are expressed in the UI will have a major impact on the usefulness of the program.
I do not see a fundamental distinction between decoding code and written language. Both are abstract symbols assembled to form constructs and actions according to a set of more or less flexible rules. Many of the higher parts of language such as metaphor also have corresponding aspects in coding. (e.g. patterns.)
And much like with literature which can be written in a multitude of languages, code can likewise be written in a multitude of languages. I think there are more similarities than differences.
Complex code is not just a specimen. It is an ecosystem, where your crazy beetle consumes crazy aphids and evades crazy ants, while simultaneously trying to reproduce itself and the entire ecosystem to the best of its ability. A crazy beetle would not eat anything standard, like say make or autoconf. No, his refined palate requires a more sophisticated tool like imake, cmake, or even an internally grown food made of ground up pythons. Eating and reproduction habits may also be equally crazy all on their own. For example, crazy beetle firefox can only reproduce when confined in a clean chroot with every consumable painstakingly arranged exactly like he wants it. Other crazy beetles sometimes refuse to eat when certain other crazy beetles are present. When let loose in an unfamiliar environment, crazy beetles sometimes quietly die for no apparent reason and intensive investigation may be required to uncover the cause of their demise. A biology degree may be helpful in such circumstances.
No no, certain parts of coding are very much like literature. The style of how you... branch based on a string, or how you implement event-driven coding, or how you distribute computing power.
There are a TON of ways to skin those cats and which way you do it is a matter of stylistic preference. It's fashion. The exact sort of subjective shindig that lit majors whittle away their time with. It's like the difference between writing in first person or third person. And in certain places one way is very much better than the other.
But who the hell reads code for the stylistic appreciation? We read code because it's broken, we want to steal part of it, or we need it to do something else. That's not a stylistic issue, that's a mechanic wrenching on a car. Figuring out just what the hell it's doing is a different act than bickering how it could have been done better. Doing the first part pays a lot better than the second.
This guy has noticed that most people that read things are reading restaurant menus, technical documents, text books, grocery lists, and not novels. The writers of said material are doing it to get shit done rather than fretting about how they do it.
As a Literature Major/Programmer, let me start by admitting the obvious: of course code is not literature, it's code. That being said, there is far more in common between the two than most programmers realize. Just as you can write an essay that's easy to understand and follow, or hard, regardless of the topic of the essay, so can you write code that's easy/hard to grok regardless of what it actually does.
In both cases a good writer tries to make the subject matter accessible to the reader, precisely so that they *don't* have to go on a scientific expedition just to grok it.
No one, but NO ONE likes to work with someone else's code. It's bad news when you have to do this at work. It's much more effective to rewrite the job from scratch after planning the functionality, encapsulation etc. etc. again. So to most people this type of job (whether it's like a line by line dissemination of a machine code tome or whether it's like an insectoidal dissection) is an unusual and unpracticed chore.
The purpose of existence is to make money.
I examine code much like I study chess tactics. When should I use pointers, smart pointers, or garbage collectors? What are the methods and advantages of using different paradigms? When I read other people's code I'm constantly examining and criticizing their choices and implementations. If I find something that works better than what I'm doing I change my habits. It's not like sitting down and reading a book; it's more like trying to solve a chess puzzle.
Clean code is another matter entirely. All code should be written for the next programmer to be able to easily digest its intent and function, but making code easy to read over its functionality should be a programming sin. Not using threads because it's harder for the next guy to debug is not an acceptable practice. If the code is not cleanly and clearly written it is very difficult to divine its true purpose.
I've imagined converting the text to something we can all naturally understand like sewer pipes and flows of water (data) in the third dimension.
You and many others. But people keep reinventing Labview, and it's always a bit of a mess. Not to mention change review and source control!
If you're really interested in that kind of abstraction you might want to take a look at Haskell with arrows and monads.
Unlike literature, Art, music, etc., code has very little redundancy. In fact, low redundancy is a quality measure for code. Hence "reading" is an entirely unsuitable approach, as "reading" relies on relatively high redundancy on all layers.
IMO somebody that does not understand this beforehand does not have any real understanding of what code is.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
"Comments should say what is not obvious and cannot be made obvious by the code itself."
Comments should explain the WHY, not the WHAT.
e.g. /* cannot do <obvious first choice> because <explanation> */ or /* X can never be Y because <proof> */
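Both comment patterns above can be sketched in C. The function below is a hypothetical illustration, not taken from any code discussed in this thread; the name `find_sorted` and its signature are assumptions made up for the example.

```c
#include <stddef.h>

/* Hypothetical helper: returns the index of key in the sorted array a
 * (n elements), or -1 if key is not present. */
long find_sorted(const int *a, size_t n, int key)
{
    size_t lo = 0, hi = n;  /* search the half-open range [lo, hi) */

    while (lo < hi) {
        /* WHY: cannot use (lo + hi) / 2 because lo + hi can wrap around
         * for very large arrays. */
        size_t mid = lo + (hi - lo) / 2;

        /* WHY: mid can never equal hi because lo < hi here, so a[mid]
         * is always inside the array. */
        if (a[mid] < key) {
            lo = mid + 1;
        } else if (a[mid] > key) {
            hi = mid;
        } else {
            return (long)mid;
        }
    }
    return -1;
}
```

Neither comment restates what the line does; each records a decision or an invariant that the code alone cannot make obvious.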
Plain English (a movement by the osmosian order) is a selfcompiling programming language composed of literally plain english
I don't see how any programmer would think code was literature, except perhaps highly technical literature. You read novels from beginning to end. You read code on an as-needed basis. You might only read the header of a library. In fact I've seen good libraries where the only docs I read were long comments in the header file. If you want to understand a system you might start with main() or your language's equivalent and find some kind of dispatch function with calls to things like ResizeWindow which is *boring* and calls to things like DetectThief which is *interesting* so you drill down into the DetectThief function and find out where it gets the data and how it decides the user might be a thief. That might only be a few thousand lines that you've looked at. The other 30k lines of GUI or sorting, or options of writing PDF reports... blah, it might not be interesting to you... unless you're a font and layout geek and the reports did something interesting with fonts and/or layouts. Then you might only read that part.
Likewise, if it crashes you'll pull it up in the debugger and read parts of the functions on the stack that lead to the crash. Aha! The contract called for the caller to not pass any NULL pointers, and they passed one. Fix. Commit. You had a *reason* for reading that code.
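A "no NULL pointers" contract like the one above is easier to find in a debugger when it is written down where the callee starts. A minimal sketch in C, assuming a made-up function `name_length` (nothing here comes from the code the poster was debugging):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical function. Contract: name must not be NULL. */
size_t name_length(const char *name)
{
    /* WHY: fail at the call boundary in debug builds, so the stack trace
     * points at the caller that broke the contract rather than at a
     * crash somewhere inside strlen. Compiled out when NDEBUG is defined. */
    assert(name != NULL);
    return strlen(name);
}
```

With the assertion in place, the person reading the stack after a crash has one less function to decode before reaching the actual mistake.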
For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
Knuth disagrees, which is why he created Literate Programming. If you aren't familiar with it, you should make yourself familiar. He suggests eventually someone might win a prize for literature from their code.
If you haven't seen it, you should check it out. His code has a table of contents, and could definitely be considered literature. His TeX code is so well organized that you can find what you are looking for within 15 minutes, even if you're not really familiar with it. That's how code should be: written so other people can read it.
That's not what the author is talking about, though. He's talking about crappy code that wasn't written in a way that was easy to understand (I read the article; this is my understanding of it). So yeah, crappy code is not literature, or easy to 'decode.' Tautological.
"First they came for the slanderers and i said nothing."
Sounds like he needs to read "The Chosen" by Chaim Potok, where he'd find another analogy. Then he could try the chevruta model.
Most of the code I see is spaghetti!
Reading a program is more like reading a mathematical proof.
It shows what the author understood as the problem, and the approach to a solution.
Thus it is much more like a "peer review" of a mathematical proof prior to publication.
The only surprising thing is that it's taken this long for him to figure this out.
Perl jokes aside, I have some old code written in everything from bash to C to R to Java. The common theme among these absolutely stunning pieces of literature is how incomprehensible some of it can be just a few months later. Sure, good code is self documenting, good code reads like a sentence, a proper module fits on one page of screen (I have a 24" display with better than 1920x1080 resolution, btw) but if my code were indeed prose, it would cause eyes to bleed, to hemorrhage, to explode in a fantastic fountain of blood and aqueous fluid.
Sometimes I wrote bits of code without knowing that there were easier ways. I may do a "for item in $(ls *.csv)" instead of the proper "for item in *.csv" or some furious hackery to manually rotate 20x10 matrix into a 10x20 (single command in several languages), or try to parse an XML file by regex'ing and other madness... Sometimes I was drunk. There was one class where the instructor didn't like "showoffs" so code had to be written using only the commands that were covered in the lecture. The resulting code from that class was horrid. One of my earliest bits of code from the 80s sent escape sequences to a printer and there are several strings with non-ASCII characters. There is no way to understand the code without knowing the printer. I have similar code for an Atari that stored music in a BASIC string. That might be possible to decode only if one understood how the Atari made sound.
This may explain why the incredibly ancient feature request for an "Outline View" in OpenOffice has gone over a decade (Reported: 2002-04-10) with no resolution.
https://issues.apache.org/ooo/...
The mental mapping of code for programmers and the mental mapping of text to those of us who write literature and non-fiction is totally different. They can't visualize how an outline and headings and the cues fonts give readers differs from all the "mind maps", "document navigators", and other inadequate replacements they keep suggesting will fill the need.
obviously white-only
We don't read poetry, we decode it. Or maybe you would say that we interpret it? Depends upon your point of view. We don't read the original article, we skim it.
The original author is romanticizing the term literature, not that there is anything wrong with that of course, but literature is a term applied to everything from Dostoevsky to instructions for assembling a toy. Beautifully/Dreadfully written code could be labeled as art, poetry, literature, garbage, puzzling, cryptography, and a whole variety of other terms.
With all that being said and putting aside that I do not agree with the original author's definition of literature, I do appreciate their perspective.
Just reading the summary, but this guy sounds off-base. There are different "densities" of writing, ranging from children's books and newspapers on one end, to stuff like academic journals, math, and poetry on the other end. Those latter types expect a much slower rate of progress, and more questioning-probing-ruminating on the text as you go. As usual, computer code is a lot like math. The author need not go so far afield for a good analogy.
We know where leadership by an anti-intellectual "strongman" who scapegoats minorities and likes boisterous rallies goes
we could just treat code as though it were code.
Oil is not Tea.
No more to ad.
Why is it so hard to only have politicians for a few years, then have them go away?
Obligatory http://en.wikipedia.org/wiki/Black_Perl
Good code is never nearly as self-documenting as people think. I have re-written stuff just giving variables and functions more accurate names, and you still should have comments saying what is going on and why.
(-1: Post disagrees with my already-settled worldview) is not a valid mod option.
When he said "A piece of code is not literature; it is a specimen." I was thinking that might apply to HIS code but that does not describe the code that good coders are trying to produce.
Also "specimen" implies the code is dead, when most code is very much alive and in flux...
"There is more worth loving than we have strength to love." - Brian Jay Stanley
Oops, hit submit instead of preview, and forgot to log in... Take 2:
There was one class where the instructor didn't like "showoffs" so code had to be written using only the commands that were covered in the lecture.
Heh. My favorite bit of code written for 1st-year C class was a simple function to convert grades from a numeric percentage to a character representing the letter grade. (Breaks were inclusive and at 10% increments, so 90-100%=A,80-89%=B, etc.) This problem was of course assigned at the end of the lecture introducing switch/case.
My code was something like:
(I'm sure it's obvious enough, but integer division does the magic; multiplying by 5, then dividing by 4, maps 0-3 to 0-3, but maps 4 to 5; this skips E to give the proper sequence of ABCDF.)
Actually, I may have handled the <50 case inline with the conditional operator instead (e.g. percent>50?percent:50), not sure. (I went through a conditional-operator-happy phase in school, having not yet realized how frequently it just makes code harder to read.)
Anyway, my instructor fortunately didn't have a policy like that, and knew me enough to know I grokked switch/case anyway. He told me later that he enjoyed showoffs, because grading normal solutions was tiresome, and a few minutes examining some other approach was a welcome break.
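The poster's snippet did not survive in this thread. The sketch below is a reconstruction from the description alone (integer division, then multiplying by 5 and dividing by 4, with everything under 50 clamped). The function name, the handling of 100, and the assumption of an ASCII-style character set where 'A' through 'F' are consecutive are all guesses, not the original code.

```c
/* Hypothetical reconstruction: convert a percentage (0-100) to a letter
 * grade without switch/case. Assumes 'A'..'F' are consecutive codes. */
char letter_grade(int percent)
{
    /* Clamp so that anything below 50 shares a bucket with 50-59 and
     * 100 shares a bucket with 90-99. */
    int clamped = percent < 50 ? 50 : (percent > 99 ? 99 : percent);

    /* 90-99 -> 0, 80-89 -> 1, 70-79 -> 2, 60-69 -> 3, 50-59 -> 4 */
    int bucket = 9 - clamped / 10;

    /* Integer arithmetic maps 0..3 to themselves and 4 to 5, so the
     * offset from 'A' skips 'E' and gives A, B, C, D, F. */
    return (char)('A' + bucket * 5 / 4);
}
```

Under these assumptions, `letter_grade(85)` gives 'B' and `letter_grade(42)` gives 'F'. Whether this is easier to read than the switch/case the assignment expected is exactly the question the thread is arguing about.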
Probably the best analogy is Architecture. There is a discipline that necessarily has a functional purpose, but still can be (and often is) viewed and appreciated as art. A large part of the appreciation of architecture is appreciation of how it went about achieving its functional purpose, and there's a large body of theory built up around this. For example, it is a controversial but generally accepted architectural principle that form should follow function. An implication of this is that unnecessary architectural features are frowned upon. In SW Engineering we call non-functional code "dead code" if it flat out can't be used, and "inelegant" if it is simply more than necessary. Both are generally frowned on.
So if you want to spend time systematically analyzing software as art, perhaps the Right way to go about it would be the way architectural reviewers do, not the way literature or "high art" reviewers do it.
I just realized that, if I follow the author's argument about decoding rather than reading, James Joyce's two major works don't qualify as literature.
quiquid id est, timeo puellas et oscula dantes.
For anyone sufficiently familiar with the (American?) grading system, possibly. All others are left puzzling until they get to the end of that parenthetical remark.
The Tao of math: The numbers you can count are not the real numbers.
Pham Nuwen would understand.
What is going on is documented by the code itself. Why is another matter - comment on the why!
Translation:
1: Don't code drunk.
2: Self documenting code isn't actually self documenting.
3: Non-self documenting code (hardware-specific strings/code) is also not self documenting.
Therefore, don't code while drunk, and comment your code. Even if it seems obvious, summarize any block of code, so when you come back to it months (or years!) later you'll have a clue of what it was supposed to do, and then maybe you'll be able to figure out what you were trying to do in the first place.
This is based on my experience of looking at code I wrote six months or more ago that worked perfectly fine but needed some new feature included. My rule of thumb nowadays is, "If a block of code is more than three lines long, it should be commented. If it's three lines or less long, it might need commenting, too."
Sure I'm paranoid, but am I paranoid enough?
Thank you for your comment this is why I come to /. still
Good code is code you can read; code should be good code, but often isn't; code that is not good does not appeal to our linguistic intuition, or its associated aesthetics; most code must be decoded, and that is because of the poor quality of authorship that results when people believe that it only matters if the program makes sense to the programmer, appears to behave correctly now, and is sufficient to get paid. We set an overly low bar for ourselves, despite the lessons of e.g. Knuth and Moore, and we reap the consequences: bugs and bad design.
Code drunk, but keep git handy. For hangover, take one git reset --hard HEAD.
Sad to say, but this is not always possible in the real world. At work, I have a task queue about 30 deep. I.e., on any given day, there are 30 different tasks to complete before they reach SLA. On some days the deadline is just a few hours away so these tasks, regardless of complexity, get hammered on. They can be simple bug fixes ("menu not selectable when in Chrome"), complex bug fixes ("New version of Chrome changes test case render"), performance fixes both simple and complex, and the dumb ("marketing wants snowflakes to fall; blizzard effect follows mouse"). So I hammer those out at a furious clip, fixing them all then signing into Remedy to mark them as completed before they hit SLA because my performance review is tied to it. I don't bother so much with comments and beautiful code.
Beautiful code I reserve for hacking on Linux and obscure libraries to calculate poker odds.
Thanks for the article /. All these years I've been developing code thinking I was writing a book for people to read, thanks for setting me straight and not wasting a minute of my life reading this crap
He did interviews with programmers to figure out that? He isn't a hacker! He is a psychiatrist!
I'm a good programmer because I did it to myself, I didn't just read high quality code! Most of the technicians who quit programming do it because they face unreadable source code.
Hate me, but.. If someone wants to be the best programmer of the galaxy, just wish for it. Don't copy code: Learn how a system works.
Oh crap, that pretentious oaf Nick Montfort is going to get involved at some point isn't he? He'll probably just copy and paste the comments into his word processor and produce a new "insightful best-seller".... Submitted as an AC due to moderation points....
The problem is when you move on and your replacement needs to get up to speed, or you need to go back two years later to fix a corner case.
Assembly code in particular can be hard to follow. Among other things, if I'm writing assembly that ties into higher-level language code I find it helpful to include human-readable descriptions of what the various registers are on entry and on exit. I'll also break up the code into logical blocks and include a comment describing at a high level what is happening in each block.
In India, software code is copyrighted as "art work". Indeed it is. Quoting someone familiar to us all - "In the work of art the truth of an entity has set itself to work."