Slashdot Mirror


Historians Recreate Source Code of First 4004 Application

mcpublic writes "The team of 'digital archaeologists' who developed the technology behind the Intel Museum's 4004 microprocessor exhibit have done it again. 36 years after Intel introduced their first microprocessor on November 15, 1971, these computer historians have turned the spotlight on the first application software ever written for a general-purpose microprocessor: the Busicom 141-PF calculator. At the team's web site you can download and play with an authentic calculator simulator that sports a cool animated flowchart. Want to find out how Busicom's Masatoshi Shima compressed an entire four-function, printing calculator into only 1,024 bytes of ROM? Check out the newly recreated assembly language "source code," extensively analyzed, documented, and commented by the team's newest member: Hungary's Lajos Kintli. 'He is an amazing reverse-engineer,' recounts team leader Tim McNerney, 'We understood the disassembled calculator code well enough to simulate it, but Lajos really turned it into "source code" of the highest standards.'"

38 of 159 comments

  1. Those were fun by certsoft · · Score: 4, Interesting

    Somewhere around 1975 or 1976 I wrote software for a 4004 (using a teletype connected to a modem connected to a mainframe someplace that had the assembler) to run an X-Y table. You would place a wafer with thick-film resistors on it and it would test each one to make sure it was within tolerance, and if it wasn't, it would mark it with magnetic ink. I think we were probably still using the infamous 1702 EPROMs but there might have been something newer at that time.

    1. Re:Those were fun by jacquesm · · Score: 2, Interesting

      somewhere around 1982 a buddy of mine and myself disassembled and commented microsoft's basic for the trs-80 color computer. Then we improved it with tons of new statements via the hook in ram. Documenting a bloody calculator is child's play compared to that and we weren't especially proud of it, just curious.

    2. Re:Those were fun by PopeRatzo · · Score: 2, Funny

      somewhere around 1982 a buddy of mine and myself disassembled and commented microsoft's basic for the trs-80 color computer.
      And you lived to tell the tale.
      --
      You are welcome on my lawn.
    3. Re:Those were fun by ozmanjusri · · Score: 2, Funny
      A little while later, he sold us the "first" 8008 in the area. Dick.

      Why the abuse?

      Did he overcharge you?

      --
      "I've got more toys than Teruhisa Kitahara."
    4. Re:Those were fun by jacquesm · · Score: 2, Interesting

      the tools we had were the Leventhal 6809 book, we wrote the disassembler (and the assembler) ourselves,
      to make it a little easier to relate to I said color computer but in fact it was a very little known
      clone called the Dragon 32 (which, incidentally as we found out had 64K that you could use if you
      pulled a few tricks).

      I wish I had known about OS-9 at the time (but this was long before the age of easy access to
      information, and in Europe).

      But hey, why am I feeding the trolls... anonymous ones at that :)

      I guess it is because I wonder what has become of the software scene that we now have f'ing laws
      that stop kids from being curious and looking 'inside the box'.

  2. And best of all by Dusty · · Score: 4, Funny

    You can still run it on the latest Intel x86 chips. ;)

  3. slashdot headline, 2057: by circletimessquare · · Score: 4, Funny

    "Historians Recreate Source Code of First 404 Error Message"

    (truth be told, quick scanning the headlines, that's what my brain registered)

    --
    intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
  4. I hate to be a pendantic jerk, but... by gatekeep · · Score: 3, Funny

    "...an authentic calculator simulator..."

    What the hell is an authentic simulator?

    1. Re:I hate to be a pendantic jerk, but... by urcreepyneighbor · · Score: 5, Funny

      Your hand?

      zing!

      --
      "The fight for freedom has only just begun." - Geert Wilders
    2. Re:I hate to be a pendantic jerk, but... by Vegeta99 · · Score: 3, Funny

      No way, man. Jill never needs flowers or to be taken out to dinner. She's better than authentic!

  5. Re:Only 1024? by jfengel · · Score: 2, Interesting

    The original lacked a gui.

    And scientific functions.

    And the ability to convert hex.

    And store/recall.

    The original had 4 functions. This one has at least 40. Would you rather the MS guys spend time seeing if they can force their 114k application down into 10k, or perhaps writing an operating system that doesn't suck?

  6. Re:Only 1024? by DragonWriter · · Score: 5, Funny

    Would you rather the MS guys spend time seeing if they can force their 114k application down into 10k, or perhaps writing an operating system that doesn't suck?


    It'd be an improvement if MS did either.
  7. Re:Only 1024? by geekoid · · Score: 2, Funny

    I'm pretty sure it had a GUI. If I were to guess, I'd say it was buttons...possibly with numbers on them.

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  8. Quickly -- someone send this to MS by Eberlin · · Score: 4, Funny

    Quick, someone send this over to the folks who wrote Excel!

  9. the output is by geekoid · · Score: 4, Funny

    58008

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  10. Commander Keen by QuantumG · · Score: 5, Interesting

    I once reverse engineered the classic id software game Commander Keen. John Carmack did some cool stuff in that code.. each sprite had two function pointers in it, one was called when the sprite came into contact with another sprite, the other was called every frame to animate the sprite (he called it the "think" function). When you killed a monster the sprite was replaced with a "body" which was just like a sprite but had a few fewer fields (so it took up less memory). One of the neatest things he did was use this exact same framework of sprites and bodies to animate the "static" parts of the game. For example, the color coded doors that you have to get the key cards to open were sprites with a contact function that checked if the player had the right key card, at which time they would "die" and be replaced by a body that had a think function that would make them slide out of the way.

    For anyone who would like to take a look, I've put the re-engineered source code up.
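
    A rough sketch of the sprite/body scheme described above, in Python rather than the original code. Every name here (`Sprite`, `Body`, `World`, `door_contact`, `door_slide`, the `keys` set) is made up for illustration and is not taken from the Keen source; it only shows the shape of the idea: two callbacks per sprite, and a lighter "body" that keeps just a think callback.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Body:
    """Lightweight leftover of a sprite: position plus an optional think callback."""
    x: int
    y: int
    think: Optional[Callable[["Body"], None]] = None


@dataclass
class Sprite:
    """Live object carrying the two callbacks described in the comment."""
    x: int
    y: int
    think: Callable[["Sprite", "World"], None]               # called every frame
    contact: Callable[["Sprite", "Sprite", "World"], None]   # called on touch
    keys: set = field(default_factory=set)   # hypothetical: key cards held
    color: str = ""                          # hypothetical: door colour


@dataclass
class World:
    sprites: list = field(default_factory=list)
    bodies: list = field(default_factory=list)

    def kill(self, sprite, body_think=None):
        """Replace a sprite with a body, optionally giving the body a think function."""
        self.sprites.remove(sprite)
        self.bodies.append(Body(sprite.x, sprite.y, body_think))

    def tick(self):
        """Run one frame: every sprite thinks, then every body that can think."""
        for sprite in list(self.sprites):
            sprite.think(sprite, self)
        for body in self.bodies:
            if body.think is not None:
                body.think(body)


def no_think(sprite, world):
    """Think function for objects that do nothing per frame."""


def no_contact(sprite, other, world):
    """Contact function for objects that ignore collisions."""


def door_slide(body):
    """Think function of a 'dead' door: slide up one unit per frame."""
    body.y -= 1


def door_contact(door, other, world):
    """A door 'dies' when touched by something holding the matching key."""
    if door.color in other.keys:
        world.kill(door, body_think=door_slide)
```

    A door and a player are both just `Sprite` instances; calling `door.contact(door, player, world)` with the right key moves the door to `world.bodies`, and each later `world.tick()` slides it a step further.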

    --
    How we know is more important than what we know.
    1. Re:Commander Keen by Cheesey · · Score: 4, Interesting
      Carmack's code is always interesting. Most famously, there's the infamous square root approximation from Quake. But I'm still impressed by the original Doom render loop, with its self-modifying code.

      The loop is drawing columns (vertical slivers of wall). It needs to interpolate between two things: the input wall texture, and the output part of the screen. Carmack uses something like Bresenham's line drawing algorithm to do this, but because the 386 has such a limited register set, he stores the fractional increment in an immediate attached to the "addl" instruction:

      doubleloop:
          movl ecx,ebp // begin calculating third pixel
      patch1:
          addl ebp,12345678h // advance frac pointer
          movb [edi],al // write first pixel
          shrl ecx,25 // finish calculation for third pixel
          movl edx,ebp // begin calculating fourth pixel
      patch2:
          addl ebp,12345678h // advance frac pointer
          movb [edi+SCREENWIDTH],bl // write second pixel
          shrl edx,25 // finish calculation for fourth pixel
          movb al,[esi+ecx] // get third pixel
          addl edi,SCREENWIDTH*2 // advance to third pixel destination
          movb bl,[esi+edx] // get fourth pixel
          decl [loopcount] // done with loop?
          movb al,[eax] // color translate third pixel
          movb bl,[ebx] // color translate fourth pixel
          jnz doubleloop
      and elsewhere... :)

          movl ebx,[_dc_iscale]
          shll ebx,9
          movl eax,OFFSET patch1+2 // convince tasm to modify code...
          movl [eax],ebx
      A similarly impressive trick is used to draw floors, where 3D interpolation is required because each texture needs to be crossed diagonally, not vertically. I never understood how Doom drew floors until I looked at the code, and I still think it's deep magic. And that's without even mentioning the BSP code!
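
      For readers who don't speak assembler, the stepping idea in the column loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 32-bit accumulator with the texel index in the top 7 bits is inferred from the `shrl ecx,25` in the listing, and `draw_column`, `FRAC_BITS` and the 128-entry texture column are names and sizes chosen here, not taken from the Doom source.

```python
FRAC_BITS = 25          # assumption: low 25 bits are the fraction, as in `shrl ecx,25`
WORD_MASK = 0xFFFFFFFF  # emulate a 32-bit register


def draw_column(texture, start_frac, step, count):
    """Sample `count` pixels from a 128-texel column using fixed-point stepping.

    `step` plays the role of the constant patched into the `addl ebp,...`
    instruction: it is added to the accumulator once per output pixel.
    """
    out = []
    frac = start_frac & WORD_MASK
    for _ in range(count):
        out.append(texture[frac >> FRAC_BITS])  # top 7 bits select the texel
        frac = (frac + step) & WORD_MASK        # advance and wrap like a register
    return out


if __name__ == "__main__":
    column = list(range(128))
    # Half a texel per screen pixel: each texel is drawn twice (wall looks 2x taller).
    print(draw_column(column, 0, 1 << (FRAC_BITS - 1), 6))
```

      Because the accumulator wraps at 32 bits, the texture index wraps at 128 for free, which is part of why this layout is convenient.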
      --
      >north
      You're an immobile computer, remember?
    2. Re:Commander Keen by Hal_Porter · · Score: 2, Informative
      Damnit slashdot really doesn't like

      patch1:
          addl ebp,12345678h
      generates some code like this

      81 C5 78 56 34 12
      The first two bytes (81 C5) encode addl with ebp as the destination, and the remaining four are the constant, stored least significant byte first.

      Now what Carmack wants to do here is to have an extra register to hold a constant so he overwrites the constant in the instruction. Patch1+2 is where the constant is

          movl ebx,[_dc_iscale] // get the constant
          shll ebx,9 // constant <<= 9
          movl eax,OFFSET patch1+2 // get the address of the constant in the instruction
          movl [eax],ebx // overwrite it in the instruction
      So essentially the code becomes

      patch1:
          addl ebp, _dc_iscale << 9
      What I'm not sure about is why he didn't do addl ebp,[FractionalIncrement]. That would take the FractionalIncrement from a memory location. Then he wouldn't need the self modifying code, he could just write _dc_iscale << 9 there.
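
      To make the patching step concrete, here is a small Python sketch that treats the six instruction bytes as a buffer and overwrites the immediate at offset 2, which is what the `patch1+2` store does. The helper name `patch_imm32` and the sample `dc_iscale` value are made up for the example.

```python
import struct

# Machine code for `add ebp, 12345678h`: opcode 81, ModRM byte C5, then a
# 32-bit immediate stored little-endian (78 56 34 12).
code = bytearray.fromhex("81C578563412")
IMM_OFFSET = 2  # the immediate starts two bytes into the instruction


def patch_imm32(buf, offset, value):
    """Overwrite a little-endian 32-bit immediate inside an instruction buffer."""
    struct.pack_into("<I", buf, offset, value & 0xFFFFFFFF)


if __name__ == "__main__":
    dc_iscale = 0x00012345  # hypothetical scale value, for illustration only
    patch_imm32(code, IMM_OFFSET, dc_iscale << 9)
    print(code.hex())  # opcode bytes unchanged, immediate replaced
```

      The opcode and ModRM bytes are left alone; only the four constant bytes change, so the instruction still reads as an add to ebp, just with a new increment.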

      On the other hand, maybe the code is memory limited anyway and this will slow it down.

      On a recent x86 processor, self modifying code will be correct but not too fast. If code is not self modifying, it will execute from the instruction cache, and writes will be buffered and stored in the writeback data cache. Self modifying code screws with this: the write needs to end up in the icache somehow. And x86 opcodes are converted into a RISC-like internal format for execution, and if these are cached you need to flush that cache too. And since this has to work on old code, the processor has to do all these steps automatically with no hints from the programmer.

      On an ARM you just need to clean the dcache, flush the write buffer and flush the icache, as ARM instructions are executed directly. And all these are things the programmer has to ask for explicitly using MCR and MRC instructions. Ironically, the ARM has more registers (13 instead of 6), so you have less need for this, and you can encode shift counts for free, so in ARM you could do this

      patch1:
      ; Rx holds whatever was in ebp
      ; Ry is _dc_iscale
        add Rx, Rx, Ry, lsl #9 ; Rx = Rx + (Ry<<9)
      --
      echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
    3. Re:Commander Keen by StormReaver · · Score: 2

      "Carmack's code is always interesting. Most famously, there's the infamous square root approximation from Quake."

      That is indeed impressive code, but John claims he didn't write it. In fact, nobody at id has claimed authorship of it. It was speculated that perhaps Michael Abrash wrote it, but he denies authorship as well. My speculation is that it was a cool snippet of code floating around the public domain, and somebody at id had the good judgment to realize that it was significantly faster than the standard library sqrt().
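
      For anyone who hasn't seen the trick: the routine actually approximates 1/sqrt(x), not sqrt(x). Below is a Python transliteration of the idea, using the magic constant as it appears in the commonly quoted Quake III Arena source. The function name `fast_inv_sqrt` and the use of `struct` to reinterpret bits are choices made for this sketch; Python will of course not show the speed advantage, only the arithmetic.

```python
import struct


def fast_inv_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for positive x using the bit-level trick."""
    # Reinterpret the 32-bit float's bits as an unsigned integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Magic constant minus half the bit pattern gives a rough first guess.
    i = (0x5F3759DF - (i >> 1)) & 0xFFFFFFFF
    # Reinterpret the integer bits as a float again.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson step refines the guess.
    return y * (1.5 - 0.5 * x * y * y)


if __name__ == "__main__":
    for value in (1.0, 4.0, 100.0):
        print(value, fast_inv_sqrt(value), value ** -0.5)
```

      After the single refinement step the result is within a fraction of a percent of the exact value, which was plenty for normalizing vectors in a game.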

  11. Re:uhhhh... by Cassius+Corodes · · Score: 2, Funny

    That is the correct answer - all modern calculators are descended from a competitor's model which incorrectly calculated 9+9 to be 18.

    --
    Control is an illusion, order our comforting lie. From chaos, through chaos, into chaos we fly
  12. How to build a CPU -- transistor level up! by compumike · · Score: 3, Informative

    Take a look at this set of videos from MIT's 6.004 Computation Structures class. They basically walk through the design of a simple 32-bit CPU from transistors, to gates, to functional blocks, to a full processor.

    Anyway, reading about how hard it was to recreate the source code from the 4004 makes me wonder how easily we could find source code for some apps from even a decade ago. Lots of companies have gone bankrupt / discontinued products / been sold / etc, and we all know that lots of people aren't good about backing up their code. It's neat to go to the Linux Kernel Archives and look at the Historic Linux sources.

    --
    Educational microcontroller kits for the digital generation.

  13. Amazing! by Reality+Master+101 · · Score: 4, Insightful

    'He is an amazing reverse-engineer,' recounts team leader Tim McNerney, 'We understood the disassembled calculator code well enough to simulate it, but Lajos really turned it into "source code" of the highest standards.'

    No disrespect to Lajos, but have we really fallen so far in programming standards that it's considered "amazing" to disassemble a 1024 byte program? Back in my day (and stay the hell off my lawn!) we used to disassemble programs all the time. I reverse engineered the operating system for a computer I developed for because we wanted to hook into places that weren't accessible.

    Disassembly is apparently a lost art in these decadent days of some programmers never using anything but scripting languages (e.g., Java, Python, Perl) and having no clue what goes on under the hood.

    --
    Sometimes it's best to just let stupid people be stupid.
    1. Re:Amazing! by Juliemac · · Score: 2, Funny

      The company I work for hired 4 programmers (from out of country) to re-work existing code and clear out known bugs. As a result, the login no longer worked. 2 weeks later, the testers could get in, but none of the drop down boxes worked, and more. Problem is they are wizards. They click and drop code without understanding what the code does. The US trained programmers can't get the time of day from the head of IT.

    2. Re:Amazing! by dmonahan · · Score: 4, Interesting

      Sometime in the early 70s, a Honeywell division, one of our steady clients, called with a strange request. They had built a small number of special machines for the Navy. Now the Navy wanted more. Honeywell had the circuit drawings and the bootable tape (which they got from the Navy). They had no documentation (not even the instruction set). They asked us to rebuild the code. We did. Dick.

    3. Re:Amazing! by be-fan · · Score: 3, Insightful

      "Programs must be written for people to read, and only incidentally for machines to execute." - Abelson & Sussman

      From a theoretical point of view, assembly knowledge isn't particularly useful because it doesn't lend itself to rigorous analysis (the "science" part of "computer science"). From a practical point of view, since very few programs are written in assembly language anymore, knowledge of it has limited utility. Further, from a practical point of view, I'd much rather deal with a programmer who can explain his work in terms of data structures and algorithms than one that is stuck thinking in terms of registers and memory locations.

      There is certainly a place for assembly knowledge*. It's just a niche, and not a particularly important one anymore. Meanwhile, there are lots and lots of diverse applications for the theory they teach you in those classes instead of assembly. In my own work, I've had to bust out the graph theory way more often than I've had to bust out my knowledge of asm tricks for fast line-rendering...

      *) Interestingly enough, one of those places is inside the language runtimes of high-level languages. There are usually lots of neat tricks inside those things (eg: using the NaN space of double-precision floats to store unions of floats and 51-bit integers without extra variant tags!)
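
      A minimal sketch of that NaN-space trick, done in Python on raw 64-bit words so the bit layout is visible. The particular tag chosen here (sign bit plus quiet-NaN bits marking an integer) and the names `box_int`, `box_float` and `unbox` are assumptions for illustration; real runtimes each pick their own layout.

```python
import math
import struct

INT_TAG = 0xFFF8000000000000        # sign bit + all-ones exponent + quiet bit
PAYLOAD_MASK = (1 << 51) - 1        # the 51 bits left over for an integer
CANONICAL_NAN = 0x7FF8000000000000  # the single NaN pattern kept for real floats


def box_float(x: float) -> int:
    """Store a double as its own bit pattern; all NaNs collapse to one pattern."""
    if math.isnan(x):
        return CANONICAL_NAN
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def box_int(n: int) -> int:
    """Hide an unsigned 51-bit integer in the payload of a tagged NaN."""
    if not 0 <= n <= PAYLOAD_MASK:
        raise ValueError("integer does not fit in 51 bits")
    return INT_TAG | n


def unbox(word: int):
    """Recover either the integer or the float from a 64-bit word."""
    if (word & INT_TAG) == INT_TAG:
        return word & PAYLOAD_MASK
    return struct.unpack("<d", struct.pack("<Q", word))[0]


if __name__ == "__main__":
    for word in (box_int(12345), box_float(1.5), box_float(float("-inf"))):
        print(hex(word), unbox(word))
```

      Every value is one 64-bit word, and telling the two cases apart costs a mask and a compare instead of a separate tag field.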

      --
      A deep unwavering belief is a sure sign you're missing something...
  14. Re:but... by Anonymous Coward · · Score: 3, Funny

    Did it, but the ATI drivers still sucked.

  15. Where's the update? by lseltzer · · Score: 5, Funny

    I found a buffer overflow. Exploit code to follow...

  16. large function in small code by Have+Brain+Will+Rent · · Score: 2, Informative

    In 1970 the PDP series from DEC, e.g. the PDP-8, had an interpreted (and interactively used) language called FOCAL, with arrays (even sparse ones), real numbers, the usual math and other functions, for loops, if statements, blah blah blah... all the usual stuff. The entire interpreter *and* runtime was programmed in a total of 2K instructions (and they were primitive instructions). That was normal for the time.

    --
    The tyrant will always find a pretext for his tyranny - Aesop
    1. Re:large function in small code by cburley · · Score: 2, Interesting

      However, the PDP-8 was a 12-bit-word minicomputer that was designed for inexpensive general-purpose computing, whereas the 4004 was (IIRC) a "tiny" 4-bit-word microchip designed mainly for numerical control applications.

      I programmed both, the latter for a friend of mine when I was about 15 years old (he later basically got me my first "real" job as a Software Engineer at Pr1me), and the -8 was definitely much easier to program, with a much more powerful instruction set — the code my friend needed written would have been much easier to write, and perhaps even "smaller" (fewer instructions, maybe fewer bits in the instruction stream?), on the -8, though I confess to remembering too little about the 4004 to be really sure about that. (Of course, the -8 wasn't nearly as nifty a machine, instruction-set-wise, as the -11, or as all-out kick-butt powerful as the -10; I wrote much more assembly/machine code for the -10 than for the other DEC systems combined, and actually got to use -10's far more often, at timesharing companies like Comp/Utility and First Data Corp in Mass. where they ran TOPS-10, and at MIT in the AI lab, where ITS ruled!)

      So, all in all, I think the calculator-on-a-4004 is probably more impressive than FOCAL or BASIC on the -8, though FORTRAN on the -8 probably was no trivial accomplishment. But I haven't looked at the source/assembly/machine codes myself to make a proper assessment.

      (This seems so long ago now. That was around the time a James Bond movie came out with Roger Moore playing Bond. I recall watching it in the theatre, pretty good-sized crowd, and, I think early on in the movie, there's a scene where Bond is in bed and gets some kind of signal or alarm (I forget which) and looks at his watch, which is shown to the viewers and is an early-model LED watch. And I distinctly recall the reaction from the mostly-male audience when he pressed a button on the watch and it lit up with the time in red LED digits: "Ooooooohhhhh!". As it happens, I later became wealthy enough to buy myself a digital watch...but, sadly, not a Lotus Esprit.)

      --
      Practice random senselessness and act kind of beautiful.
    2. Re:large function in small code by Have+Brain+Will+Rent · · Score: 2, Interesting

      I don't remember the 4004 being all *that* primitive but perhaps I'm thinking of the 8008. But even the 4004 was easier to program than an old Univac machine I used that used patch boards to set up the program - LOL. Regardless, I wasn't comparing the relative difficulty between the two machines. I was just making the general observation that people were able to put quite sophisticated software on very primitive machines in what today would be considered microscopic amounts of memory. As for Fortran, that actually wasn't that hard to implement on the PDP-8; I was much more impressed with how Algol was implemented. IIRC the Algol compiler was user contributed, in the DECUS catalogue, along with a lot of other sophisticated (for the time) user contributed software.

      In fact DECUS may have been the very first organized attempt at general distribution of free and/or open source software.

      --
      The tyrant will always find a pretext for his tyranny - Aesop
    3. Re:large function in small code by FrenchSilk · · Score: 2, Interesting

      And I once wrote a full-featured symbolic assembler in 1579 bytes. Besides symbolic labels, it supported address expressions with +-/* and logical AND/OR, hex and text strings, and a lot more. To the best of my knowledge it is the smallest symbolic assembler ever written. I published and sold it as The Assembler for the VIC-20.

  17. Re:Something is wrong...... by bpharri2 · · Score: 4, Informative

    Of course if you had bothered to read the article, you'd know that it doesn't work like today's calculators but like the old adding machines:

    "The electronic calculators that accountants used 35 years ago worked differently than the familiar four-function calculator we use today. These were designed to behave much like mechanical adding machines of the 1960's. After every number you want to add to the total, you need to press +, so = doesn't work like you'd expect. Here are some examples:

    To add three numbers: 61 + 79 + 83 + = (if you forget the last +, the 83 won't get added)
    To subtract two numbers: 2007 + 1971 - =
    To multiply two numbers: 125 x 5 = (this is more like we're used to)
    To divide two numbers: 625 / 5 = "
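
    The key sequences quoted above can be modelled with a toy interpreter. This is only a sketch of the adding-machine convention as the article describes it, not the Busicom firmware logic; the function name `run_keys` and the space-separated key format are invented for the example, and real details such as printing, decimal points and chained operations are left out.

```python
def run_keys(keys: str):
    """Evaluate a space-separated key sequence in adding-machine style.

    A number only joins the running total when + or - is pressed AFTER it;
    x and / behave more like a modern calculator and finish on =.
    """
    total = 0        # running sum, changed only by + and -
    entry = 0        # the number most recently keyed in
    pending = None   # (operator, first operand) while an x or / is open
    for key in keys.split():
        if key.isdigit():
            entry = int(key)
        elif key == "+":
            total += entry
            entry = 0
        elif key == "-":
            total -= entry
            entry = 0
        elif key in ("x", "/"):
            pending = (key, entry)
            entry = 0
        elif key == "=":
            if pending is None:
                return total          # = just reports the accumulated total
            op, first = pending
            return first * entry if op == "x" else first / entry
        else:
            raise ValueError(f"unknown key: {key!r}")
    return total


if __name__ == "__main__":
    for sequence in ("61 + 79 + 83 + =", "61 + 79 + 83 =",
                     "2007 + 1971 - =", "125 x 5 =", "625 / 5 ="):
        print(sequence, "->", run_keys(sequence))
```

    Running the second sequence shows the gotcha from the article: without the final +, the 83 never reaches the total.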

  18. Re:Only 1024? by ppc_digger · · Score: 3, Informative
    They put all the actual code in a shared library:

    # ldd /usr/bin/kcalc
    libkdeinit_kcalc.so => /usr/lib/libkdeinit_kcalc.so (0x00002b1351db8000)
    ...

    # ls -lh /usr/lib/libkdeinit_kcalc.so
    -rw-r--r-- 1 root root 436K 2007-07-03 19:15 /usr/lib/libkdeinit_kcalc.so
    --
    Of all major operating systems, UNIX is the only one originally meant for gaming.
  19. Re:Only 1024? by TDRighteo · · Score: 5, Interesting
    Floating-point math doesn't fix itself. Let's not be hard on Microsoft when:

    Python 2.5.1 (r251:54863, Oct 30 2007, 13:54:11)
    [GCC 4.1.2 20070925 (Red Hat 4.1.2-33)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 10.1-10-0.1
    -3.6082248300317588e-16
    and...

    $ perl
    printf("%s\n", 10.1-10-0.1);
    -3.60822483003176e-16
    and...

    $ php
    <?php
    echo (10.1-10-0.1);
    ?>
    -3.6082248300318E-16
    Note that the printed digits vary across languages too, even though the underlying value is the same double...
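
    A short Python check of where the three printouts above come from. The value is one and the same IEEE double; only the number of significant digits printed differs. The format strings stand in for each language's default output precision as I understand it (17 digits for Python 2.5's repr, 15 for Perl, 14 for PHP's default `precision` setting), so treat those mappings as assumptions.

```python
x = 10.1 - 10 - 0.1

# Same double, three different print precisions.
as_python25 = "%.17g" % x   # assumption: Python 2.5 repr() printed 17 digits
as_perl = "%.15g" % x       # assumption: Perl's default of 15 digits
as_php = "%.14G" % x        # assumption: PHP's default precision of 14

# The error is exact and explainable: 10.1 is rounded to a multiple of 2**-49
# and 0.1 to a multiple of 2**-56, and the leftovers add up to -13 * 2**-55.
exact = -13 * 2.0 ** -55

if __name__ == "__main__":
    print(as_python25)
    print(as_perl)
    print(as_php)
    print(x == exact)
```

    So the calculators in all three languages agree with each other bit for bit; they just disagree about how much of the noise to show.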
  20. Asimov's The Feeling of Power by drachenstern · · Score: 3, Interesting

    How has no one mentioned this yet? - Don't blame me too much, I just copied and pasted from: http://downlode.org/Etext/power.html

    The Feeling Of Power
    by Isaac Asimov

    Jehan Shuman was used to dealing with the men in authority on long-embattled earth. He was only a civilian but he originated programming patterns that resulted in self-directing war computers of the highest sort. Generals, consequently listened to him. Heads of congressional committees too.

    There was one of each in the special lounge of New Pentagon. General Weider was space-burned and had a small mouth puckered almost into a cipher. He smoked Denebian tobacco with the air of one whose patriotism was so notorious, he could be allowed such liberties.

    Shuman, tall, distinguished, and Programmer-first-class, faced them fearlessly.

    He said, "This, gentlemen, is Myron Aub."

    "The one with the unusual gift that you discovered quite by accident," said Congressman Brant placidly. "Ah." He inspected the little man with the egg-bald head with amiable curiosity.

    The little man, in return, twisted the fingers of his hands anxiously. He had never been near such great men before. He was only an aging low-grade technician who had long ago failed all tests designed to smoke out the gifted ones among mankind and had settled into the rut of unskilled labor. There was just this hobby of his that the great Programmer had found out about and was now making such a frightening fuss over.

    General Weider said, "I find this atmosphere of mystery childish."

    "You won't in a moment," said Shuman. "This is not something we can leak to the firstcomer. Aub!" There was something imperative about his manner of biting off that one-syllable name, but then he was a great Programmer speaking to a mere technician. "Aub! How much is nine times seven?"

    Aub hesitated a moment. His pale eyes glimmered with a feeble anxiety.

    "Sixty-three," he said.

    Congressman Brant lifted his eyebrows. "Is that right?"

    "Check it for yourself, Congressman."

    The congressman took out his pocket computer, nudged the milled edges twice, looked at its face as it lay there in the palm of his hand, and put it back. He said, "Is this the gift you brought us here to demonstrate. An illusionist?"

    "More than that, sir. Aub has memorized a few operations and with them he computes on paper."

    "A paper computer?" said the general. He looked pained.

    "No, sir," said Shuman patiently. "Not a paper computer. Simply a piece of paper. General, would you be so kind as to suggest a number?"

    "Seventeen," said the general.

    "And you, Congressman?"

    "Twenty-three."

    "Good! Aub, multiply those numbers, and please show the gentlemen your manner of doing it."

    "Yes, Programmer," said Aub, ducking his head. He fished a small pad out of one shirt pocket and an artist's hairline stylus out of the other. His forehead corrugated as he made painstaking marks on the paper.

    General Weider interrupted him sharply. "Let's see that."

    Aub passed him the paper, and Weider said, "Well, it looks like the figure seventeen."

    Congressman Brant nodded and said, "So it does, but I suppose anyone can copy figures off a computer. I think I could make a passable seventeen myself, even without practice."

    "If you will let Aub continue, gentlemen," said Shuman without heat.

    Aub continued, his hand trembling a little. Finally he said in a low voice, "The answer is three hundred and ninety-one."

    Congressman Brant took out his computer a second time and flicked it. "By Godfrey, so it is. How did he guess?"

    "No guess, Congressman," said Shuman. "He computed that result. He did it on this sheet of paper."

    "Humbug," said the general impatiently. "A computer is one thing and marks on a paper are another."

    "Explain, Aub," said Shuman.

    "Yes, Programmer. Well, gentlemen, I write down seventeen, and just undernea

    --
    2^3 * 31 * 647
  21. Re:Only 1024? by dotgain · · Score: 2, Funny

    I wonder why that acronym never caught on?

  22. 1024 Bytes? Bah! by LS · · Score: 2, Interesting

    How about 256 bytes for a 3D rotating parallax tunnel fly-through !!!

    LS

    --
    There is a fine line between being a cultivated citizen and being someone else's crop. - A. J. Patrick Liszkie
  23. Please post this code. by Simonetta · · Score: 2, Interesting

    Thousands of people now and in the future would be interested in studying this code. Please dig up and post this work. Perhaps to one of the 'vintage computer' websites.

    People are still writing assembler code for tiny microprocessors. However, now it is being done for very inexpensive microcontrollers like the Atmel AVR and the Microchip PIC. These ICs have all their major components integrated (like program ROM, limited RAM, UARTs, and ADC) and sell for about $1-$2. This business is moving to the C language as 32-bit, 128 Kbyte, 50 MHz microcontrollers like the ARM fall below the $5 price.

    But constructing code out of instruction sets one byte at a time is still done for very low-end devices like the Atmel Tiny11 that sells for about $0.30 each. At this price, they can replace 555 timers and TTL gates in updates of classic 1970s and 1980s electronic designs.