Slashdot Mirror


Simpler "Hello World" Demonstrated In C

An anonymous reader writes "Wondering where all that bloat comes from, causing even the classic 'Hello world' to weigh in at 11 KB? An MIT programmer decided to make a Linux C program so simple, she could explain every byte of the assembly. She found that gcc was including libc even when you don't ask for it. The blog shows how to compile a much simpler 'Hello world,' using no libraries at all. This takes me back to the days of programming bare-metal on DOS!"

51 of 582 comments

  1. Missing the point by textstring · · Score: 5, Funny

    Interesting, but she does sort of sidestep the whole 'Hello World!' part of a hello world program.

    1. Re:Missing the point by cthugha · · Score: 5, Insightful

      Since the output is the Answer to the Ultimate Question, it necessarily incorporates or encodes every possible output of every possible program, including the string "Hello World!".

      The method for extracting the particular output desired is left as an exercise for the reader.

    2. Re:Missing the point by dzfoo · · Score: 4, Interesting

      After reading the linked article, I felt underwhelmed. Then I read the second article referenced in the summary:
              http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html

      Now, that was interesting!

      The strange thing is that the summary seems to imply that both articles are related, which they are most definitely not. The first one seems to be written by a naive noob, who just discovered a nifty trick in gcc. The second one is written by a real Wizard, who shows you how to conjure up some arcane magic to make ELF your bitch.

            -dZ.

      --
      Carol vs. Ghost
      ...Can you save Christmas?
  2. Re:Hello World by Megaweapon · · Score: 5, Funny

    FYI, Steve Jobs came up with the idea for the "Hello World" app.

    He also holds the design patent on the touch wheel interface for it.

    --
    I'm sure "SlashdotMedia" will improve on all the wonders that Dice Holdings blessed us all with
  3. Old news is VERY OLD by deblau · · Score: 5, Informative
    --
    This post expresses my opinion, not that of my employer. And yes, IAAL.
    1. Re:Old news is VERY OLD by shird · · Score: 5, Informative

      Indeed, this is very old news, it's been done many times before. I recall reading and applying this article for Windows many years ago:
      http://msdn.microsoft.com/en-us/magazine/cc301696.aspx

      there's also: http://www.ntcore.com/files/SmallAppWiz.htm and http://www.phreedom.org/solar/code/tinype/ (again for windows) and many more.

      --
      I.O.U One Sig.
    2. Re:Old news is VERY OLD by Gamma747 · · Score: 4, Interesting

      It was uploaded to Reddit 12 hours ago; that's probably why it's just reaching Slashdot now.

  4. Nice but? by garcia · · Score: 4, Insightful

    Ok, this is wicked great in theory. Our programs have become bloated. We do have them taking up too much RAM, HD space, and CPU time. But after reading through this in-depth analysis I have to wonder if it's all worth it.

    If we're willing to leave behind all pretenses of portability, we can make our program exit without having to link with anything else. First, though, we need to know how to make a system call under Linux.

    Or I can just write it the old way, making the file size larger and not have to concern myself with portability and how to make system calls under Linux. After all that's what the whole point of this all was right?

    1. Re:Nice but? by dido · · Score: 4, Insightful

      Which is missing the point. Haven't you ever wondered what's really in that 11k of machine code, and what it actually does? We've gotten so insulated from the lower levels of our computers that we no longer really understand how they do something so basic as terminating their own execution. The article felt more to me like an expository attempt to shed light on some of the things that libc has to do for us, rather than practical advice on attempting to make our programs smaller.

      --
      Qu'on me donne six lignes écrites de la main du plus honnête homme, j'y trouverai de quoi le faire pendre.
  5. I can code that app in... by putaro · · Score: 5, Funny

    45 bytes, huh? I can do it in....

    #!/bin/sh
    exit 42

    18 bytes and it's portable across all Unices. Maybe the assembler version is faster, though?

    1. Re:I can code that app in... by Dynetrekk · · Score: 5, Insightful

      Hm, if I make a file 'hello.py' with the following content:

      print 42

      ...and say to Mac OS X "open .py files in the python interpreter" and double-click, it does the job. In 9 bytes. I guess you can get it shorter if you use a language with a shorter "print" statement / function?

      And how big is Python?

      Granted, but how big is linux, letting you run that ELF?

    2. Re:I can code that app in... by larry+bagina · · Score: 4, Funny

      White pythons are generally 5-8 inches. It's a popular misconception that blacks have larger pythons. In reality, the average black python is slightly smaller than the average white python, but there's much more size variation. Asian pythons are smaller. And then there are some unfortunate guys with a micro python thats only 1-2 inches.

      --
      Do you even lift?

      These aren't the 'roids you're looking for.

  6. Umm, but by Psychotria · · Score: 4, Insightful

    Since when does a Hello World program not actually output anything?

  7. Re:11k Is Too Big? by CapnStank · · Score: 5, Insightful

    I think you missed the point of the article.

    The author is trying to highlight that the amount of bloat in modern programs is so rampant that even "Hello World" is excessively oversized for what it accomplishes. How can we as programmers expect fast, efficient, lightweight code when our compilers (even ones as popular as gcc) are bloating the program without being asked to?

  8. If it's so simple, by newcastlejon · · Score: 4, Insightful

    Why doesn't it fit in TFS?

    --
    If God forks the Universe every time you roll a die, he'd better have a damned good memory.
  9. Re:11k Is Too Big? by exasperation · · Score: 5, Insightful

    As to the point of this... we recently had a story about how computers had gotten "too big to understand".

    And here we have a program, 45 bytes long, for which every single byte has a well-explained purpose. It's getting back to the bare metal and that's what makes it interesting. =)

  10. IEFBR14 by kenh · · Score: 4, Interesting

    Mainframers have been using this most simple of all utilities for decades - literally. The Wikipedia entry on it has a good write-up about this (literal) do-nothing program. Its whole purpose is to provide a mechanism to exploit the various functions contained in JCL to create, delete, and otherwise manipulate datasets on mainframes.

    The wikipedia entry is here: http://en.wikipedia.org/wiki/IEFBR14

    --
    Ken
  11. Simpler "Hello World" in C? by kenh · · Score: 5, Insightful

    At the end, the code was assembler, and the compiler wasn't even called - just the linker. I can't say for sure where a C program ends and an assembler program begins, but I'm fairly certain that the last few iterations are assembler, based on the "let's do away with the compiler" suggestion.

    Also, "Hello World" programs have to, you know, actually display the message "Hello World" - this is a program that isn't written in C, and doesn't write "Hello World" - care to revisit the title of this entry?

    --
    Ken
  12. YES!!!! FINALLY by commodoresloat · · Score: 5, Funny

    Thank God we have finally crossed this hurdle. The baffling complexity of helloworld.c is no longer an obstacle to world domination.

    I think we can now finally say once and for all that 2010 will be the year of Linux on the desktop.

  13. Re:11k Is Too Big? by Anonymous Coward · · Score: 4, Informative

    The whole point was learning ELF structure and why things were the way they were. Didn't you ever wonder why a "hello world" program took over 4000 bytes on a modern computer, when in 1980 a Commodore VIC-20 managed to play games in less than 4K of available memory? This wasn't a waste of time.

  14. Re:11k Is Too Big? by gzipped_tar · · Score: 5, Insightful

    But my stupid build process that generates the bloated Hello World is much more maintainable. Now get off my lawn.

    --
    Colorless green Cthulhu waits dreaming furiously.
  15. Re:11k Is Too Big? by Simonetta · · Score: 4, Interesting

    "An 11k app is not going to make me, or my computer, say 'Good Bye World'"

      It is if your computer is a 38-cent Atmel AVR tiny 10, which only has enough space for 512 12-bit instruction words. This chip is about half the size of a sunflower seed, but is faster, and, in several ways, more powerful, than the original $5000 IBM PC from 1981.

      Get away from the idea of Gigahertz desktops and $1000 laptops and join the real computer revolution!

        For me, if it costs more than $5, it's not a computer that I take seriously. It's just a 20th-century digital processing appliance.

  16. C++ is worse by MobyDisk · · Score: 4, Insightful

    Shouldn't the linker remove unreferenced functions?

    I've had this problem with gcc for a while, with C++ code. I was writing some embedded code, and I wanted to use some simple C++. Just by adding a #include of one of the stream libraries, the executable grew by 200k, even though none of it was referenced. The C++ code in iostream is template-generated anyway, so even if the compiler wanted to include the code, it can't until I instantiate it.

    1. Re:C++ is worse by macshit · · Score: 5, Informative

      Shouldn't the linker remove unreferenced functions?

      I've had this problem with gcc for a while, with C++ code. I was writing some embedded code, and I wanted to use some simple C++. Just by adding a #include of one of the stream libraries, the executable grew by 200k, even though none of it was referenced. The C++ code in iostream is template-generated anyway, so even if the compiler wanted to include the code, it can't until I instantiate it.

      <iostream> includes references to global stream objects like std::cout, not just interface definitions, so including it is going to have larger ramifications than something like <fstream>, which just defines interfaces (and indeed, for me, including <fstream> seems to have no effect on program size, whereas including <iostream> adds about 300 bytes to a simple executable).

      --
      We live, as we dream -- alone....
  17. Re:Occams Wedge by Arker · · Score: 4, Insightful

    But it really is much simpler. The reason your 'average first year comp sci student' might find it less understandable is because they don't actually understand the bloated version either. Using a high-level language doesn't reduce complexity, quite the opposite in fact, it greatly increases actual complexity. It simply makes it easier to get something done without understanding it, and thus makes it easier to kid yourself into thinking you know what you are doing, when you don't.

    --
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-
    Friends don't let friends enable ecmascript.
  18. Re:11k Is Too Big? by jc42 · · Score: 4, Insightful

    Yeah, but the 45-byte program doesn't say "Hello World". In fact, there's no example that I can find in TFA that outputs that message or any other. So the summary is incorrect on its face. TFA doesn't show a simpler "Hello World" program; it doesn't show any sort of "Hello World" program at all.

    I feel cheated, and tricked into reading an article that didn't do what was advertised.

    (It's not the author's fault, of course; the author didn't claim to be writing the sort of program that the summary talked about. Though I was a bit disappointed that only the first few examples were in C. The article was almost entirely about assembly-language programs. So again, I was a bit disappointed, since I was hoping to learn something about making C programs smaller. This was done only in the first example, and it was made smaller by removing its call on write() so it didn't output anything at all. I already understood that I can make programs smaller by removing all functionality. ;-)

    --
    Those who do study history are doomed to stand helplessly by while everyone else repeats it.
  19. Re:Hello World by thegrassyknowl · · Score: 5, Funny

    No, Steve Jobs designed "iHello World", which is actually one byte larger than the standard hello world app, but he's litigating against everyone who creates "Hello World" since 100% of it is quite obviously a subset of "iHello World".

    --
    I drink to make other people interesting!
  20. Re:So what? by AmberBlackCat · · Score: 5, Insightful

    Maybe thinking like that is why we have to get 4 gigs of ram to run without slowing down lately. I bet every executable on the hard drive has an extra 11k that somebody thought was insignificant.

  21. Re:BTDT by tomhudson · · Score: 5, Insightful

    She found that gcc was including libc even when you don't ask for it.

    This is basic knowledge that ANYONE using c should know - that the startup library is linked to so it can find main.

    This is almost as lame as their previous slashvertisement/product_whoring - where they claimed to have gotten around the Mythical Man-Month and quadrupled output - and it turned out that neither claim was true.

    And their lame excuse, which I derided in this comment:

    Greg Price wrote:

    "what I hoped to get across in this post is that that's not true--in the right circumstances, adding people to a software project can get a lot done, even in a short time"

    As many people have pointed out, you did NOT add people to a software project. You created a dozen small, one-person projects. Your self-serving reply to all that is just one more mis-representation. Have you no shame?

    I'm sure we're not the only ones to have used embedded assembler in c programs.

  22. Not a C program by erroneus · · Score: 4, Informative

    I wasted too much time reading this one... nothing surprising about what I found in it. Step one, don't write it in C. Step two, stop linking to things that aren't needed. Step three, perform the functions contained in the library omitted manually. Step five, start cheating in the elf binary format.

    The only thing interesting about it was that the article pointed out an interesting fact -- Linux will run inappropriately formatted binaries. BAD. Linux kernel people? Are you reading this? Fix it before someone figures out how to use this in making and executing more exploits.

  23. Re:11k Is Too Big? by ucblockhead · · Score: 4, Insightful

    The fact that helloworld.c compiles to 11k has less to do with bloat than it has to do with people generally not caring about 11k. You could get rid of that 11k, but to do so, you'd have to make trade offs that either make real programs either slower or bigger, or make compilation slower. Very few people would make those trade offs in the other direction. Those that do either use special purpose compilers or (more likely) write in assembly.

    --
    The cake is a pie
  24. Re:11k Is Too Big? by Zouden · · Score: 4, Funny

    Get away from the idea of Gigahertz desktops and $1000 laptops and join the real computer revolution!

    You're right! I'm going to throw my laptop out the windows right now! Reading slashdot will be so much more fun on a computer smaller than a sunflower seed.

    --
    "A week in the lab saves an hour in the library"
  25. That's exactly it by Anonymous Coward · · Score: 4, Funny

    Guy reminds me of an old joke.

    What's the difference between a bitch and a whore?

    A whore fucks everyone. A bitch fucks everyone but you.

  26. Re:So what? by wiredlogic · · Score: 4, Insightful

    Most of the microprocessors in the world today have less than a few 10's of kilobytes of RAM. They tend to do useful things most of the time.

    --
    I am becoming gerund, destroyer of verbs.
  27. Re:11k Is Too Big? by walshy007 · · Score: 4, Interesting

    #hello world tiny program
    .equ SYSCALL, 0x80
    .equ SYS_EXIT, 1
    .equ SYS_WRITE, 4
    .equ STDOUT, 1

    .section .data
    hello:
    .ascii "hello world!\n"

    .section .text

    .globl _start
    _start:
    movb $SYS_WRITE, %al     #put write syscall in eax
    movb $STDOUT, %bl     #set stream to stdout
    movl $hello, %ecx #give address of start of buffer to print
    movb $13, %dl     #how many characters of buffer to print
    int $SYSCALL
    movb $SYS_EXIT, %al
    int $SYSCALL

    The above is a tiny hello world program i wrote myself, it's worth noting that even the resulting binary is larger than it needs to be, I wound up with a 133 byte binary by moving the text string into the ELF header via hex editor, and changing the instruction data to point to the new addresses.

    Kind of hard to get it smaller than that while keeping it in ELF format, considering the actual object code in the binary was something like 15 bytes with the data illegally in the header.
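
    For anyone wondering where the rest of those bytes go, here is a rough sketch in Python 3 of a 32-bit ELF image holding only an ELF header, one program header, the instructions above, and the string. It is not the parent's actual build: the load address 0x08048000, the hand-assembled opcode bytes, and the helper names (build_code, build_image) are assumptions made for illustration. Like the assembly above, the byte-sized moves rely on the upper register bits being zero at entry.

```python
import struct

BASE = 0x08048000            # assumed load address (a common i386 default)
EHDR_SIZE = 52               # size of a 32-bit ELF header
PHDR_SIZE = 32               # size of one 32-bit program header
MESSAGE = b"hello world!\n"


def build_code(msg_addr, msg_len):
    """Hand-assembled i386 bytes for the write/exit sequence (assumed encodings)."""
    return (
        b"\xb0\x04"                              # movb $4, %al    (SYS_WRITE)
        + b"\xb3\x01"                            # movb $1, %bl    (STDOUT)
        + b"\xb9" + struct.pack("<I", msg_addr)  # movl $msg, %ecx
        + b"\xb2" + bytes([msg_len])             # movb $len, %dl
        + b"\xcd\x80"                            # int $0x80
        + b"\xb0\x01"                            # movb $1, %al    (SYS_EXIT)
        + b"\xcd\x80"                            # int $0x80
    )


def build_image():
    """Return header + program header + code + string as one bytes object."""
    code_len = len(build_code(0, len(MESSAGE)))
    entry = BASE + EHDR_SIZE + PHDR_SIZE         # code starts right after the headers
    code = build_code(entry + code_len, len(MESSAGE))
    total = EHDR_SIZE + PHDR_SIZE + len(code) + len(MESSAGE)

    # Magic, ELFCLASS32, little-endian, version 1, then zero padding to 16 bytes.
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    ehdr = struct.pack(
        "<16sHHIIIIIHHHHHH",
        ident,
        2,             # e_type: ET_EXEC
        3,             # e_machine: EM_386
        1,             # e_version
        entry,         # e_entry
        EHDR_SIZE,     # e_phoff: program header follows the ELF header
        0,             # e_shoff: no section headers
        0,             # e_flags
        EHDR_SIZE,     # e_ehsize
        PHDR_SIZE,     # e_phentsize
        1,             # e_phnum
        0, 0, 0,       # e_shentsize, e_shnum, e_shstrndx
    )
    phdr = struct.pack(
        "<8I",
        1,             # p_type: PT_LOAD
        0,             # p_offset: map the file from its first byte
        BASE, BASE,    # p_vaddr, p_paddr
        total, total,  # p_filesz, p_memsz
        5,             # p_flags: read + execute
        0x1000,        # p_align
    )
    return ehdr + phdr + code + MESSAGE


image = build_image()
print(len(image), "bytes in total,", EHDR_SIZE + PHDR_SIZE, "of them headers")
```

    Under those assumptions the image comes to 114 bytes, 84 of which are the two headers, which is why going much lower means overlapping code or data with the header. The sketch only shows the layout; it does not check that any particular kernel will load the result.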

  28. Re:11k Is Too Big? by MachDelta · · Score: 4, Insightful

    TFA explains it: main() isn't the true start of the program, _start is. That resides in crt1.o, which fires off a bunch of setup stuff before calling __libc_start_main, which in turn kicks off main(), and off your program goes.

    To put it as a car analogy: What she found is that turning the key to start doesn't just activate the starter, it also activates the airbag system, the traction control, and the radio too. And if all you want to do is start the engine to prove that it runs (ala Hello World!), then it's kind of silly to lug around all that extra "unnecessary" crap too.

    Or something like that. Sadly i'm a better mechanic than a programmer (4yrs vs 1yr), but i'm working on fixing that. :)

  29. +5, Insightful by aussersterne · · Score: 5, Insightful

    Mod parent up. This is all a semantic game about where significant portions of functionality are stored (and thus counted or not). After all, back in the "pre bloatware" days, you'd have had to manage all of the complexities of machine management and I/O yourself. The assembly would have been much larger to achieve the same effect.

    Yes, you can make the argument that Linux comes with screen I/O, a scheduler, memory management, etc. already, so that's just overhead, but as others have pointed out, you can say the same thing about bash. It comes everywhere and is just overhead.

    --
    STOP . AMERICA . NOW
  30. 29 bytes ! Beat that !!! by Anonymous Coward · · Score: 5, Interesting

    c:\ xxx>debug
    -a
    mov dx, 100
    mov cx, 000D
    mov bx, 1
    mov ah, 40
    int 21
    mov ah, 4C
    int 21
    -f 111 "Hello World!"
    -a100
    mov dx, 0111
    -r cx :001D
    -n c:\ xxx\ hello.com
    -w
    -q

    c:\ xxx>hello.com
    Hello World!

    c:\ xxx>dir hello.com
    03/18/2011 11:29 AM 29 HELLO.COM
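
    The same 29 bytes can be reproduced outside DEBUG. A small Python 3 sketch, assuming the usual 8086 encodings for these instructions (the helper names le16 and build_hello_com are made up for illustration):

```python
ORG = 0x100  # DOS loads a .COM image at offset 100h within its segment


def le16(value):
    """Encode a 16-bit value in little-endian byte order."""
    return value.to_bytes(2, "little")


def build_hello_com(message=b"Hello World!"):
    """Return the bytes of the .COM file typed into DEBUG above."""
    code_len = 17                   # total size of the seven instructions
    msg_addr = ORG + code_len       # 0111h: the string sits right after the code
    code = (
        b"\xba" + le16(msg_addr)    # mov dx, 0111  (address of the string)
        + b"\xb9" + le16(0x000D)    # mov cx, 000D  (byte count, as typed in the session)
        + b"\xbb" + le16(1)         # mov bx, 1     (handle 1 = standard output)
        + b"\xb4\x40"               # mov ah, 40    (write to file handle)
        + b"\xcd\x21"               # int 21
        + b"\xb4\x4c"               # mov ah, 4C    (terminate program)
        + b"\xcd\x21"               # int 21
    )
    assert len(code) == code_len
    return code + message


image = build_hello_com()
print(len(image), "bytes =", hex(len(image)))
```

    That gives 17 bytes of code plus the 12-character string, i.e. 1Dh, matching the value loaded into CX before the write. Note that the count passed to the write call is 000D (13) while "Hello World!" is 12 characters, so the call is asked for one byte more than the file contains.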

  31. Re:11k Is Too Big? by santax · · Score: 5, Insightful

    Try programming a micro-controller and suddenly you'll be facing hardware limits that force you to favor small unreadable code over bigger more maintainable code. There is a solution for it though... comments! Lots of them :D

  32. Re:11k Is Too Big? by mirix · · Score: 4, Informative

    gcc for an AVR target doesn't make an 11k hello world, though.

    Probably because that's an application where it matters, and a modern PC it doesn't matter at all.

    --
    Sent from my PDP-11
  33. The summary headline is crap by jamrock · · Score: 4, Informative

    There are three links in the article summary. The first is to the Wikipedia entry for "Hello World"; the second is to an article about writing "Hello World" without libc; the third is to part II of the second, which examines the ELF format and demonstrates the 45-byte program. The summary headline is rubbish. Whoever wrote it either (a) never read either article, or (b) deliberately sensationalized it by conflating the salient features of both articles, in which case they should be working for the tabloids.

  34. Did similar back in MS-DOS 2.11 by Brett+Johnson · · Score: 4, Interesting

    Back in the early 1980s, I was doing development on MS-DOS 2.11 - the first real working version of MS-DOS that resembled Xenix more than CP/M.

    I was using a combination of Lattice C and assembly language to do my day job. But I was upset about the libc bloat that Lattice C would drag into the program. Over the Christmas break, I sat down and wrote a tiny version of libc, with the 60% of the calls I actually used. Most of them were either thin wrappers on top of MS-DOS Int21 calls, assembly language implementations (the string functions), or reduced functionality (printf didn't handle strange alignments, floats or doubles), and custom startup/exit code. I also structured the library so that the linker would only link in functions that were actually used. For simple executables, I saw the on-disk file size drop from 10KB-20KB down to 400-600 bytes. Another thing that reduced on-disk file size was to create .com programs, rather than .exe programs.

    I was also writing the handful of unix commands that I couldn't do without (ls, cat, cut, paste, grep, fgrep, etc). Since I was implementing dozens of Unix commands, each statically linked to libc, it was very important to reduce the over-all size of each executable. Most of the smaller trivial commands were less than 1KB in size. I think the largest was 4KB. I also had an emacs clone* that was 36KB when compiled and linked against my tiny lib.

    For the longest time, I carried around a bootable MS-DOS 2.11 floppy, with my dozens of Unix commands, an emacs-like editor, Lattice C compiler, tiny libc, and some core MS-DOS programs. It allowed me to have my entire development environment on a floppy that I could stick in anyone's machine and make it usable.

    * We had a source license for Mince, orphaned by Mark of the Unicorn, a tiny emacs-clone that ran on CP/M, MS-DOS, and Unix. We had enhanced it significantly.

  35. Re:So what? by Yosho · · Score: 4, Insightful

    I bet every executable on the hard drive has an extra 11k that somebody thought was insignificant.

    So if you have, say, 1000 open processes, that means your computer is wasting 11 MB of RAM. Such inefficiency!

    Actually, the reason you need 4 GB of RAM is because the programs you're using are far more complex than the ones that people were using when 256 MB was top-of-the-line. You may say, "But all I need is to read e-mail and browse the web!" -- except that nowadays those tasks involve rendering GUIs with Javascript, streaming and playing HD video in realtime, and doing constant full-text indexing in the background so that you can quickly search anything for any phrase. On top of that, in the background your operating system is trying to predict what you'll do next and prefetching blocks from your hard drive into RAM so that they'll already be cached when you actually need them.

    Some of that RAM is honestly being taken up by insignificant chunks of data, but most of it really is being used.

    --
    Karma: Terrifying (mostly affected by atrocities you've committed)
  36. Re:BTDT by crossmr · · Score: 4, Informative

    What a shock this comes from Kdawson. I'm about one more kdawson article away from dumping slashdot. I can't imagine that all the people in the slashdot batcave aren't laughing at this tool.
    I sometimes wonder if he just goes out and gets completely hammered at lunch then comes back and picks a few articles.

  37. Re:BTDT by h4rr4r · · Score: 4, Interesting

    Some people like their code to run on OSes for grownups.

  38. Re:BTDT by guyminuslife · · Score: 5, Funny

    The fact that people would even still use C at all for anything anywhere ever shocks me.

    I started writing device drivers in Ruby, and have never looked back.

    In order to get Ruby to run on my system, I run it in an interpreter. The interpreter is written in Java, which is a much faster language and therefore more suitable as an interpreter.

    The JVM on my system is written in C#. I know that C# is comparable to Java in terms of efficiency, but since this is a Windows machine, I figure it's "closer to the metal."

    The implementation of the .NET framework on my computer (and the Windows operating system itself) is written in Ruby. Since I already have a Ruby interpreter on my system, this presents no problems.

    --
    I don't believe in time. It's a grand conspiracy designed to sell watches.
  39. Hey, I heard that Windows isn't the only OS... by N0Man74 · · Score: 5, Insightful

    It's too bad with all these things you "heard" that you didn't happen to hear that programs are written for environments other than Windows (or Linux, Mac OS, etc), and for devices other than PCs. It's unfortunate that you are so in the dark that you don't realize that there are entire industries that rely on devices that have tiny fractions of the memory and processor speed that you ignorantly assume that we all have access to. You probably have no idea how often you are affected by devices that run 100 times slower than the desktop PC you gave as an example, or also have 1,000 times less RAM. On some of these devices C is the most advanced language you can get short of writing a compiler or interpreter yourself.

    Sure, pissing away storage space and waving a hand at execution efficiency is fine for some circumstances, but sometimes it's a luxury you can't afford. The world of software development is far bigger than the tiny little niche of programming you've been exposed to.

    I suggest you use some "real" perspective, and reevaluate what a "real language" is.

  40. Re:11k Is Too Big? by Tim+C · · Score: 4, Funny

    My God, are you saying that people should use the right tools and techniques for the job at hand, rather than applying the same limited ones to every problem they come across?

  41. Re:BTDT by kevingolding2001 · · Score: 4, Insightful

    *sigh*

    Been there done that... on the PDP-11 in 1979.

    And did you write up a nice article for other people to learn from what you had done?

    I think the real value here is not that she did this, but that she wrote it up in a nice easy to read way so that you can follow her train of thought and get a feel for how one goes about tinkering with compilers and such.

    This adds value for people like me who are not as smart as you. I could never have done this on a PDP-11 (although I did have access to one back in my days at university). I also previously would not have known enough to do this in Linux. But having read this article I feel I have learnt something and have a new insight into how linkers and libraries work. Who knows, maybe I will be able to do something similar myself after this learning experience, and for that I am grateful to Jessica for doing it, writing about it and (I'm guessing it was her) submitting it to /.

    Now I shall respectfully step off your lawn.

  42. Re:BTDT by dzfoo · · Score: 4, Funny

    I have a suggestion: If you write your JVM in Visual Basic instead of C#, it'll be portable, since most old microcomputers included BASIC in ROM. And, of course, .NET already brings Visual Basic.NET!

          -dZ.

    --
    Carol vs. Ghost
    ...Can you save Christmas?
  43. Re:BTDT by jlehtira · · Score: 5, Insightful

    She found that gcc was including libc even when you don't ask for it.

    This is basic knowledge that ANYONE using c should know - that the startup library is linked to so it can find main.

    Okay, and where am I supposed to learn it from? That was new to me, after using gcc for a very long time.

    I'm actually very happy that someone out there told me something that you think I should just know.

    So it wasn't new to you? Don't read it.