Slashdot Mirror


Learning Computer Science via Assembly Language

johnnyb writes " A new book was just released which is based on a new concept - teaching computer science through assembly language (Linux x86 assembly language, to be exact). This book teaches how the machine itself operates, rather than just the language. I've found that the key difference between mediocre and excellent programmers is whether or not they know assembly language. Those that do tend to understand computers themselves at a much deeper level. Although unheard of today, this concept isn't really all that new -- there used to not be much choice in years past. Apple computers came with only BASIC and assembly language, and there were books available on assembly language for kids. This is why the old-timers are often viewed as 'wizards': they had to know assembly language programming. Perhaps this current obsession with learning using 'easy' languages is the wrong way to do things. High-level languages are great, but learning them will never teach you about computers. Perhaps it's time that computer science curriculums start teaching assembly language first."

17 of 1,328 comments (clear)

  1. Computer Science Curriculums by pointzero · · Score: 3, Interesting

    Perhaps it's time that computer science curriculums start teaching assembly language first.
    Here at University of New-Brunswick (Canada), they may not teach assembly language "first", but we do have a second year course dedicated to assembly and the inner workings of computers. My only problem though at the school is that we learn the Motorola M68HC11 cpu and not current ones. Sure it's easier to learn and understand, but most computers we work on today are x86 based.

    My 2 cents.

  2. For the record by pclminion · · Score: 5, Interesting
    The CS program at Portland State University starts with assembly in their introductory classes. At the time I was there, it was x86 assembly, but I've heard that some professors are using Sparc assembly as well -- not a good idea in my opinion, simply because of 1) the delay slot and 2) the sethi instruction, both of which are a little confusing for someone who's never coded before, let alone never coded in assembly language.

    I think it's a little weird to call this "Learning Computer Science via Assembly Language." It's programming, not computer science. Computer science is really only marginally about computers. It has to do more with algorithms, logic, and mathematics.

    You can study computer science, and produce new knowledge in the field, without ever touching a computer.

    This misunderstanding is, I think, part of the reason so many students drop out of CompSci. They head into it thinking it's about programming, and are startled to find that computation and programming are not equivalent.

    That's why the Compilers course at PSU is considered the "filter" which kills all the students who aren't really interested in computer science. They really need to spin off a seperate "Software engineering" school for these students, since what they really want to study is programming.

  3. Syntax, OS interfaces... by Cryptnotic · · Score: 5, Interesting

    Well, for starters the syntax for assemblers is different. There are two standards, the AT&T standard (which is used by the GNU assembler) and the other one that is more familiar to DOS/Windows x86 assembly programmers (which is used by the NASM assmebler).

    Second, OS interfaces for making system calls (e.g., to read files, open network connections, etc) are different in Linux versus DOS or Windows).

    --
    My other first post is car post.
  4. Re:Linux x86 assembly? by pla · · Score: 5, Interesting

    Is "Linux x86 assembly" any different to any other kind of "x86 assembly"?

    Yes. Although it requires understanding the CPU's native capabilities to the same degree, Linux uses AT&T syntax, whereas most of the Wintel world uses (unsurprisingly) Intel/Microsoft syntax.

    Personally, although I far prefer coding C under Linux, I prefer Intel syntax assembly. Even with many years of coding experience, I find AT&T syntax unneccessarily convoluted and somewhat difficult to quickly read through.

    The larger idea holds, however, regardless of what assembler you use. I wholeheartedly agree with the FP - People who know assembly produce better code by almost any measurement except "object-oriented-ness", which assembly makes difficult to an extreme. On that same note, I consider that as one of the better arguments against OO code - It simply does not map well to real-world CPUs, thus introducing inefficiencies in the translation to something the CPU does handle natively.

  5. Re:Your book? by FreshFunk510 · · Score: 5, Interesting

    Ugh. Johnnyb should've wrote a disclaimer that he was promoting his own book. This type of action turns me away from wanting to support such an individual. Sorry, nothing personal.

    --


    "Injustice anywhere is a threat to justice everywhere." - Martin Luther King, Jr.
  6. Re:Linux x86 assembly? by skurk · · Score: 5, Interesting

    You have a good point. I code assembly on many different CPU's, and I can only see a minor difference between them

    For example, on the 6502 family (like the 6510 from the C64), you have only three registers; X, Y and A. These registers can only hold a byte each. Most of the variables you have are stored in zero pointers, a 255-byte range from address $00-$FF.

    Then the 68k CPU (as in the Amiga, Atari, etc) you have several more registers which can be used more freely. You have D0-D7 data registers and A0-A7 address registers. These can be operated as bytes, words or longwords as you wish, from wherever you want.

    The x86 assembly is written the "wrong way", and is pretty confusing at times. Where I would say "move.l 4,a6" on the 68k, I have to say "mov dx,4" on the x86. Takes a few minutes to adjust each time.

    Once you master assembly language on one CPU, it's pretty easy to switch to another.

    I still think the 680x0 series are the best.

    --
    www.6502asm.com - Code 6502 assembly or.. DIE!!
  7. Re:Linux x86 assembly? by pla · · Score: 5, Interesting

    maxim: cycles are cheap, people are expensive.

    True. This topic, however, goes beyond mere maximizing of program performance. Pur simply, if you know assembler, you can take the CPU's strengths and weaknesses into consideration while still writing readable, maintainable, "good" code. If you do not know assembly, you might produce simply beautiful code, but then have no clue why it runs like a three-legged dog.


    it is significantly better value to design and build a well architected OO solution

    Key phrase there, "well-architected". In practice, the entire idea of "object reuse" counts as a complete myth (I would say "lie", but since it seems like more of a self-deception, I woun't go that far). I have yet to see a project where more than a handful of objects from older code would provide any benefit at all, and even those that did required subclassing them to add and/or modify over half of their existing functionality. On the other hand, I have literally hundreds of vanilla-C functions I've written over the years from which I draw with almost every program I write, and that require no modification to work correctly (in honesty, the second time I use them, I usually need to modify them to generalize better, but after that, c'est fini).


    Who cares if it's not very efficient - it'll run twice as fast in 18 months

    Y'know, I once heard an amusing joke about that... "How can you tell a CS guy from a programmer?" "The CS guy writes code that either won't run on any machine you can fit on a single planet, or will run too slowly to serve its purpose until technology catches up with it in few decades". Something like tha - I killed the joke, but you get the idea.

    Yeah, computers constantly improve. But the clients want their shiny new software to run this year (if not last year, or at least on 5-year old equipment), not two years hence.

  8. Re:Your book? by FreshFunk510 · · Score: 5, Interesting

    Oh, most definitely.

    Had he admitted that it was his own book, I might've actually read about the book, read the summary and saw if I was interested (as I'm a developer and have a degree in CS).

    But when he doesn't admit that and writes obviously biased remarks regarding knowing assembly to be a good programmer, I can't help but view it skeptically. In fact when I saw the response that it was his book I didn't even bother reading it anymore. All the words he posted lose credibility.

    And considering I got Score: 5 quite quick I have a feeling other people would agree with me. People don't like being "tricked" into buying stuff. It's the same reason why people don't like vendor-lock in and hate Microshaft.

    If you tell people its' your book and you give an open and honest review/opinion regarding it people will actually respect that and read about the book. Hey, it's just my guess, but I think I'm right.

    --


    "Injustice anywhere is a threat to justice everywhere." - Martin Luther King, Jr.
  9. Re:Not So New Concept by Sire+Enaique · · Score: 3, Interesting

    It's just what somebody said earlier, it's not the language itself that's difficult to learn, it's the APIs.

    20 years ago hp calculators used ASM as their programming language - though it wasn't called asm by hp. I didn't realize it until I started learning Z80 asm and noticed it was almost the same as my hp-15C's language.

    Except for the hp-16C, those calculators weren't targetted at programmers, so I guess there must be lots of people out there who know asm without realizing they do...

  10. Re:Not So New Concept by cide1 · · Score: 5, Interesting

    I think people that have been out in the real world for a while don't realize how many languages are taught in school's today. I'm in Purdue's computer engineering program, and so far, I have had to learn c, c++, fotran, java, asm, python, bash, and ksh. I'm sure there are some others I have forgotten (like the little bit of perl I know). We also learn ABEL and VHDL, several version control systems and Makefiles. My roommate is taking an AI course right now which is all in scheme. People talk about the old days like their is lost knowledge, we still learn all that. I could write ATARI games in asm if I wanted to, but why do that when tools are available that let me do so much more? For learning, we don't have to learn assembly first anymore, you can start with any language. I think it is good to take a two pronged approach. Learn C first, and at the same time, start learning digital logic. C compilers are forgiving, and warn about obvious errors compared to assembly just doing exactly what you tell it do. When one is comfortable with both, I think learning assembly is much easier. Without this prior knowledge, you are just doing what you are told to do, you do not really understand it, and when you make mistakes, you have a harder time understanding why. After learning asm, it is easier to think like the machine, and easier to write efficient code in other languages.

    --
    -- the computer doesn't want any beer, no matter how much you think it does. NEVER, EVER feed your computer beer.
  11. Actually, they DON'T. by Ungrounded+Lightning · · Score: 5, Interesting

    People who know assembly produce better code by almost any measurement except "object-oriented-ness", which assembly makes difficult to an extreme.

    Actually, they don't.

    A study was done, some decades ago, on the issue of whether compilers were approaching the abilities of a good assembly programmer. The results were surprising:

    While a good assembly programmer could usually beat the compiler if he really hunkered down and applied himself to the particular piece of code, on the average his code would be worse - because he didn't maintain that focus on every line of every program.

    The programmer might know all the tricks. But the compiler knew MOST of the tricks, and applied them EVERYWHERE, ALL THE TIME.

    Potentially the programmer could still beat the compiler in reasonable time by focusing on the code that gets most of the execution. But the second part of Knuth's Law applies: "95% of the processor time is spent in 5% of the code - and it's NOT the 5% you THOUGHT it was." You have to do extra tuning passes AFTER the code is working to find and improve the REAL critical 5%. This typically was unnecessary in applications (though it would sometimes get done in OSes and some servers).

    This discovery lead directly to two things:

    1) Because a programmer can get so much more done and working right with a given time and effort using a compiler than using an assembler, and the compiler was emitting better assembly on the average, assember was abandoned for anything where it wasn't really necessary. That typically means:

    - A little bit in the kernel where it can't be avoided (typically bootup, the very start of the interrupt handling, and maybe context switching). (Unix System 6 kernel was 10k lines, of which 1.5k was assembler - and the assembly fraction got squeezed down from then on.)

    - A little bit in the libraries (typically the very start of a program and the system call subroutines)

    - Maybe a few tiny bits embedded in compiler code, to optimize the core of something slow.

    2) The replacement of microcoded CISC processors (i.e. PDP11, VAX, 68K) with RISC processors (i.e. SPARC, MIPS). (x86 was CISC but hung in there due to initera and cheapness.)

    Who cares if it takes three instructions instead of one to do some complex function, or if execution near jumps isn't straightforward? The compiler will crank out the three instructions and keep track of the funny execution sequence. Meanwhile you can shrink the processor and run the instructions at the microcode engine's speed - which can be increased further by reducing the nubmer of gates and length of wiring, and end up with a smaller chip (which means higher yeilds, which means making use of the next, faster, FAB technology sooner.)

    CISC pushed RISK out of general purpose processors again once the die sizes got big: You can use those extra gates for pipelining, branch prediction, and other stuff that lets you gain back more by parallelism than you lost by expanding the execution units. But it's still alive and well in embedded cores (where you need SOME crunch but want to use most of the silicon for other stuff) and in systems that don't need the absolute cutting-edge of speed or DO need a very low power-per-computation figure.

    The compiler advantage over an assembly programmer is extreme both with RISC and with a poorly-designed CISC instruction set (like the early x86es). Well-designed CISC instruction sets (like PDP11, VAX, and 68k) are tuned to simplify the compilers' work - which makes them understandable enough that the tricks are fewer and good code is easier for a human to write. This puts an assembly programmer back in the running. But on the average the compiler still wins.

    (But understanding how assembly instruction sets work, and how compilers work, are both useful for writing better code at the compiler level. Less so now that optimizers are really good - but the understanding is still helpful.)

    --
    Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
  12. Re:Not the point! by EugeneK · · Score: 3, Interesting


    AND(a,b) = XOR(XOR(a,b),OR(a,b))


    You might be right about the cost of gates; I have no idea. Ok, it was a dumb joke. :)

  13. Re:nah by dubious9 · · Score: 3, Interesting

    Yeah because you don't use variables, conditionals and types in assembly. Right. You just have to think harder about it.

    Also, you can use object oriented principles in assembly, just like you can in C. There aren't any convenient keywords or enforced methodologies (ie everything is an object), but you can gather sections of code that tie data and associated functionality together. You learn the advantages of modular programs. You learn how to count in binary. It forces you to not code carelessly. You learn good coding principles that you can apply to higher languages. In general you get another tool in your tool box. I'd say there are different flavors of programming and they aren't mutually exclusive. ie:

    Object Oriented
    Procedural
    Event-Based
    Interpreted
    Managed Memory (garbage collection)
    Assembly

    To leave one of these out of a CS program leaves me wondering how good the program is.

    --
    Why, o why must the sky fall when I've learned to fly?
  14. Re:Linux x86 assembly? by vsprintf · · Score: 4, Interesting

    I don't know why, but just saying the words 'assembly language', sends a chill down my spine. I guess I am too weak minded to learn it.

    Maybe individual brains just work in different ways. In school, I knew some people who were good with high-level languages but just couldn't hack assembler. They could not get down to that absolute minimal step-by-step instruction level. I'm not sure what that says about those of us who use assembler. :) BTW, I certainly don't advocate assembler as a first computer language - second, perhaps.

  15. Re:Linux x86 assembly? by Kosgrove · · Score: 4, Interesting

    I disagree. Assembly has little to nothing to do with either programming or computer science anymore. Computer science (IMHO) deals with the study of software engineering and algorithms (network protocols, etc). Computer engineering deals with custom integrated circuit development, including processor architecture.

    I learned assembly as an undergraduate at Penn State (before attending a year of grad school at Drexel), but what I got out of the course had far more to do with understanding architecture (something not relevant for most developers, but much more relevant for hardware engineers).

    Assembly does not have any of the high-level features (OOP, libraries, etc) features that developers need to know these days. It's rarely used, even in embedded programming since C/C++ compilers have gotten quite efficient and are available even for open-core (similar to open source) procesors for use on FPGA's.

    On the other hand, assembly is important to know for computer engineering undergrads and graduates interested in architecture, and having taught in the CompEng department there, I can say that the depth of assembly in the cirriculum there is not sufficient.

  16. From a ring counter to OOP by rcpitt · · Score: 4, Interesting
    The first piece of digital gear I ever had my hands on was the size of a small desk and about 5' high. It had just enough "logic" on it to create an eight bit ring counter (0 1 10 11 100...)

    That was in 1965 and the unit was the only one for our whole school district and worth about $30,000 (CDN - about $35000 US at the time)

    From there to today's Object Oriented Programming languages has been an interesting time. I wouldn't have missed it for anything, and I honestly think that living through it has given me a perspective that many more recent programmers don't have and IMHO need, sometimes.

    Where "brute force and ignorance" solutions are practical, there is no gain in knowing enough about the underlying hardware and bit twiddling to make things run 1000% faster after spending 6 months re-programming to manually optimize. In fact, since (C and other) compilers have become easily architecture tuned, there really are few areas where speed gains from hardware knowledge can be had, let alone made cost effective. Most are at the hardware interface level - the drivers - most recently USB for example.

    If you're happy with programming Visual Basic and your employer can afford the hardware costs that ramping up your single CPU "solution" to deal with millions of visitors instead of hundreds, then you don't need to know anything about the underlying hardware at the bit level.

    On the other hand, if you need to wring the most "visits" out of off-the-shelf hardware somehow, then you need to know enough to calculate the theoretical CPU cycles per visit.

    Somewhere between these two extremes lies reality.

    Today I use my hardware knowledge mostly as a "bullshit filter" when dealing with claims and statistics from various vendors. I have an empiric understanding of why (and under what circumstances) a processor with 512 Megs of level 1 cache and a front-side bus at 500MHz might be faster than a processor with 256 Megs of L1 cache and a 800MHz FSB and vice versa. Same thing for cost effectiveness of SCSI vs IDE when dealing with a database app vs. access to large images in a file system (something that came up today with a customer when spec'ing a pair of file servers, one for each type application)

    Back in the mid 70s I dealt with people who optimized applications on million $ machines capable of about 100 online users at one time. Today I deal with optimization on $1000-$3000 machines with thousands of 'active' sessions and millions of 'hits' per day. Different situations but similar problems. Major difference is in the cost of "throwing hardware at the problem" (and throwing the operating systems to go with the HW - but then I use Linux so not much of a difference ;)

    Bottom line is that understanding the underlying hardware helps me quite a bit - but only as a specialist in optimization and cost-effectiveness now, not in getting things to work at all as in the past.

    --
    Been there, done that, paid for the T-shirt
    and didn't get it
  17. Re:Somewhere in the middle... by Mandatory+Default · · Score: 3, Interesting

    It's clear you weren't writing code 30 years ago and you have no ideas what it was like back then. I was there. If you had ever tried to write an entire application in 2K, you'd know that every trick Mel played saved precious memory and/or cycles. Back then, hardware was expensive, people were cheap. 4K words of memory cost more than most people made in a year (and no, 4K words was not 4KB) - and that pricing was long after the drums described in this article. The optimizers were interesting toys, but that was about it.

    Your talking about documented code makes pretty clear how little you know on the subject. He was writing machine code. Not assembly. Not C. Not Perl. Machine code. There's nowhere to put comments. Even if there were, you'd never waste a machine word on them.

    If you've never had to get a forklist to help you lift your computer, you aren't qualified to pass judgment on Mel.