Slashdot Mirror


Smallest Possible ELF Executable?

taviso writes "I recently stumbled across this paper (google cache), in which the author investigates the smallest possible ELF executable on Linux. There's some interesting stuff in it, and it's well worth a read. The author concludes, 'every single byte in this executable file can be accounted for and justified. How many executables have you created lately that you can say that about?'"

50 of 451 comments

  1. Not good enough by mesocyclone · · Score: 5, Funny

    It isn't amazing until it's also palindromic!

    --

    The only good weather is bad weather.

    1. Re:Not good enough by Anonymous Coward · · Score: 5, Funny

      aibohphobia-the fear of palindromes. get it?

    2. Re:Not good enough by red_dragon · · Score: 5, Funny

      Weird... I read that as "aibophobia", and thought it was the fear of electronic pets.

      --
      In Soviet Russia, Jesus asks: "What Would You Do?"
  2. smallest elf execution by Anonymous Coward · · Score: 5, Funny

    I just heard the news on slashdot -- Frodo Baggins, the smallest elf, was just executed! No other details were available.

    1. Re: smallest elf execution by Black+Parrot · · Score: 5, Funny


      > I just heard the news on slashdot -- Frodo Baggins, the smallest elf, was just executed! No other details were available.

      In my own research I have discovered that the average Hobbit executable is barely half the size of the average Elf executable.

      They're faster to run in a tight spot, too!

      --
      Sheesh, evil *and* a jerk. -- Jade
    2. Re:smallest elf execution by mbogosian · · Score: 5, Funny

      Frodo Baggins, the smallest elf, was just executed!

      Unfortunately, the article was incorrect then. Frodo is a hobbit. Furthermore, he is far from the smallest hobbit.

      However, he was executed. By two elves. By way of trampling.

      Does that mean we can assume that ELF binaries run on Hobbits?

      (Sorry, I couldn't resist.)

    3. Re:smallest elf execution by mbogosian · · Score: 5, Funny

      I'm confused. Is this a troll or not?

      No. Trolls are completely different creatures from hobbits and elves.

  3. No law on repeat articles? by ebuck · · Score: 4, Interesting

    Last time I read this on Slashdot was less than a year ago. I imagine in 4 or 5 months we'll see it again.

    The article is great. It really is a good intro to refresh that assembly / understand ELF / do neat stuff. I still have the tiny assembler installed on my machine from the last go round.

    I've heard of a guy who is trying to make the world's smallest 'cat' program. I wonder how many other utilities have been similarly "optimized".

  4. You disgust me... by tuxedo-steve · · Score: 5, Funny

    ... wanting to execute the smallest possible elf. You Americans and your bloodsports. Barbarians.

    If you guys go ahead with your cold-hearted plan to execute this elf, the Olsen twins better watch their backs next time they're in Ireland, if you catch my drift.

    --
    - SMJ - (It's not just a name: it's a bad aftertaste.)
  5. Small virus catcher (for DOS) by Fuzzums · · Score: 5, Interesting

    in assembly: RET

    All this one byte program does is terminate execution. If it's infected by a virus you'll see soon enough if the size has increased.

    Of course, with today's macro viruses this doesn't work anymore :(

    --
    Privacy is terrorism.
    1. Re:Small virus catcher (for DOS) by RinkSpringer · · Score: 4, Insightful

      Uhrm, not really. Almost any COM-infecting virus will read the first 3 bytes and check whether they start with a JMP instruction (opcode 0xEB or 0xE9). If they don't, it usually refuses to infect the file.

      Therefore, this file wouldn't be infected by like 99% of all COM-infecting viruses...
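
A rough sketch of the check described above, for illustration only. The function name `looks_infectable` is hypothetical, and real infectors presumably varied in what they tested; the only facts taken from the comment are the 3-byte read and the two JMP opcodes.

```python
# Opcodes named in the comment above: 0xE9 (near JMP) and 0xEB (short JMP).
JMP_OPCODES = {0xE9, 0xEB}


def looks_infectable(com_image: bytes) -> bool:
    """Hypothetical heuristic: would a JMP-checking infector accept this file?

    Reads the first 3 bytes and accepts the file only if they begin
    with one of the JMP opcodes.
    """
    header = com_image[:3]
    # A file shorter than 3 bytes cannot hold the 3-byte header at all.
    if len(header) < 3:
        return False
    return header[0] in JMP_OPCODES


# The one-byte RET program (0xC3) fails the check, so an infector
# following this rule would skip it.
ret_only = bytes([0xC3])
print(looks_infectable(ret_only))
```

Under that assumption, the one-byte "virus catcher" would simply be passed over rather than grow in size.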

  6. I feel guilty by Anonymous Coward · · Score: 5, Funny

    This makes my new 100-gig hard drive seem WAY too big.

  7. Re:Turbo Pascal by compwizrd · · Score: 5, Interesting

    Some of the old-time Demo groups (and warez groups) would put very nice VGA demos in 4k as well.

  8. Smallest Posible Post by Anonymous Coward · · Score: 5, Funny
    1. Re:Smallest Posible Post by moonbender · · Score: 5, Funny

      Wow, you even saved a byte by mis-spelling "Possible" - awesome!

      --
      Switch back to Slashdot's D1 system.
  9. Bloat...now a worldwide concept! by Masque · · Score: 4, Funny

    This guy clearly doesn't get the point!

    67% of Americans are overweight. They can't account for most of the bites they use. By developing software that is just as bloated, the users feel good about themselves.

    This kind of skinny programming is very insensitive to the fatass society we Americans live in! Hopefully the U.S. Congress hears of this soon, so that they may legislate this kind of software right off the face of the earth.

    Masque, head of the Sensitive Programming Foundation*

    [*A division of Maxtor Corporation; come check out our new 320GB drives, featuring room for tomorrow's applications...today.]

  10. Windows exe by Sivar · · Score: 4, Interesting

    Cas, of StorageReview.com's forums, created a 324-byte Windows 2000 PE executable. It completely blew away all of mine, the smallest of which were about 700 bytes.

    --
    Computer Science is no more about computers than astronomy is about telescopes. --E. W. Dijkstra
  11. Re:umm.... yeah? by Leonel · · Score: 4, Interesting
    Did you bother to read it?

    From the article, after the first try in asm:

    Looks like we shaved off a measly twelve bytes. So much for all the extra overhead that C automatically incurs, eh?
  12. justification by mbogosian · · Score: 5, Funny

    every single byte in this executable file can be accounted for and justified

    The author's sanity, however, cannot.

  13. Optimized Executables by aking137 · · Score: 5, Informative

    I think there are quite a few. It's seen as a challenge, and does have practical uses. Have a look at Tom's Rootboot disk - it includes a web server, a telnet server, a telnet client, an NFS client, wget, gzip, bzip2, vi, a whole load of network drivers, and a tonne of other stuff, all compressed down onto one floppy disk. Only I've never quite been able to find the source code for any of it despite spending a small amount of time looking - possibly someone would be able to put me right on that one.

    There are also lots of interesting articles on linuxassembly.org.

    Andrew

    1. Re:Optimized Executables by Sparr0 · · Score: 5, Informative

      You are incorrect. The filesystem is stored on the disk in a compressed form, it is decompressed to a RAM drive. Every byte DOES count.

    2. Re:Optimized Executables by Fastolfe · · Score: 5, Informative

      Your information is dated. There are smarter filesystems nowadays that can allocate data from more than one file into a single "cluster". ReiserFS is one such filesystem for Linux, but there are surely others.

  14. Not bad... by Captain+Pedantic · · Score: 5, Interesting

    But I'd like to see them get a Breakout clone in 1K

    --

    None are more hopelessly enslaved than those who falsely believe they are free. Johann Wolfgang von Goethe.
  15. No need to be smaller than 512 really... by Anonymous Coward · · Score: 5, Interesting

    Hard drive sizes being what they are now, the smallest sector size I see is 512 bytes. If the file stored in that sector is smaller than 512 bytes, it still takes up 512 bytes. Very interesting article, however.

    1. Re:No need to be smaller than 512 really... by the+way,+what're+you · · Score: 5, Funny
      Unless, of course, you're using ReiserFS with tail packing turned on.

      This should really be added to the Linux Gay Conspiracy.

      --
      example.org - powered by Linux!
  16. Bigger is Sometimes Better by ksw2 · · Score: 5, Informative
    I used to kill myself trying to strip a few lines of code from my programs... in my mind, I was trying to emulate the PDP hackers of the 60s (my heroes) by finding the one "Right Thing" for each program.

    Soon I realized that smaller programs are not the end-all goal of programming. If a slightly bigger program is easier to understand for the next person who modifies/maintains it, then that is the new "Right Thing" for that application... and I realized the efficient programming of the PDP days was a byproduct of necessity more than anything else. It's seldom needed with today's blazing hardware capabilities.

    This isn't to say that many of today's programs are over-bloated, but just to reinforce the trade-off between small and easy to understand.

  17. Microsoft. Yes, Microsoft by tomoose · · Score: 4, Insightful

    Reminds me of one of Bill Gates' first programs - Micro-Soft's 1975 Altair BASIC. Unfortunately the page I wanted to link to has gone, but this is something from The Register at the time: http://www.theregister.co.uk/content/4/18949.html

    Finally found a web archive of the page I wanted: http://web.archive.org/web/20011031094552/www.rjh.org.uk/altair/4k/index2.html

    A real pity that standards have slipped so much since then.

    (I refuse to post anonymously even though I have mentioned Microsoft in a thread about Small Exes. So there :p )

  18. Efficiancy in OS programming needed by TibbonZero · · Score: 5, Insightful

    We really need more efficient programming in OSes today. Look at the system requirements for OSes over the past few years. It's gone crazy. Check out the requirements for NT Workstation 4.0, Windows XP Pro and Windows 2000 Pro.

    Doesn't something seem messed up? What have we really gained since 4.0 that requires 4x the memory, 3x the processor, and almost 15x the hard drive space? Are USB and FireWire support really that big? And have you ever tried to run XP on the minimum system? It doesn't work so well. I remember being able to tweak a system to run Windows 95 on a 386 with 5 MB of memory and a 45 MB hard drive. It wasn't pretty but it could run. Today if you aren't going 1 GHz+, then they want to leave you behind.

    They are just using really fast hardware as an excuse for bloating the code.
    Even Linux (Red Hat more so) is guilty of this.
    Remember when awesome games could fit on a handful of floppies? I think that could fly today if they tried. Look at the demo scene. 64K can do a lot of graphics. The most awesome games like Betrayal at Krondor were only a few floppies. Sure, if you have big hardware use it, but don't waste it. Programmers are just getting slack and including (literally) everything in the world, and not writing anything for themselves. They aren't looking to optimize stuff, just to kick it out and make money (obviously open source isn't guilty of the money or the fast kickout thing)...

    --
    Tibbon
    tibbon.com
    1. Re:Efficiancy in OS programming needed by coupland · · Score: 5, Insightful

      Unfortunately my take on this situation is a bit more sinister. Your post mentions that NT 5.0 (Windows 2000 Pro) requires 64M of RAM, yet NT 5.1 (Windows XP) requires 128M of RAM. Why the twofold increase for a minor upgrade? Well, consider two things:

      1. Windows XP was released during one of the slowest hardware sales slumps in PC history. All the big players were hoping to see XP spur sales. Not coincidentally, for many people XP required a new PC.
      2. Microsoft can only stand to benefit from these PC sales in the form of OEM licenses.

      Yes, I'm cynical, but I've always been of the belief that the bloat in XP is engineered, not the simple result of bad programming. To think that the project managers and marketing don't talk about these sorts of things is naive.

    2. Re:Efficiancy in OS programming needed by Fastolfe · · Score: 4, Insightful

      I think a lot of it is due to pressure to get a product out. Developers are relying exclusively nowadays on high-level languages, even in OS design, and those that write the compilers don't spend as much time on getting good, compact, precise and optimized code out of high-level code. Nobody cares. CPU is cheap, hard disk is cheap. Why should they work to make their stuff efficient when they can just claim their product is so advanced it requires twice the resources.

      Part of it also lies on the shoulders of developers. A lot of developers today are simply programmers that learned C in high school. They have little understanding of machine languages, assembly, or the CPU architectures they're coding for. They know just what the high-level languages look like and one or two ways of accomplishing their goal. What they need to know is how their software design decisions actually get implemented by the assembler and executed by the architecture. Memory efficiency never even crosses their mind. Who wants to pay for programmers that actually know their shit when they can just claim their product is so advanced it now requires four times the resources?

      Perhaps this is another area in which OpenSource software can shine some day...

  19. proccessing in today's world by eng69 · · Score: 5, Funny

    The current state of elf processors demands an astounding amount of system resources. When combined with a dwarf coprocessor, it provides for unparalleled carnie access.

  20. MenuetOS by jaaron · · Score: 4, Interesting

    On a similar topic, MenuetOS is a full OS written in assembly and fits on a floppy. Yeah, lots of OS's used to fit on floppies, but it's still cool. It's amazing what all you can fit into a small space if you're careful.

    --
    Who said Freedom was Fair?
  21. On bloatnesses by Ektanoor · · Score: 4, Insightful

    Well, this reminds me of the golden days of DOS (not Denial Of Service but Disk Operating System... well, anyway, it didn't make a difference). Back then people fought for every bit of code. And assembler was as popular as C or Pascal.

    However, using assembler this way is not the best use of resources. Frankly, this piece of code is only useful if you need a really tiny program and you are running out of space and speed. But today, 99.99...% of tasks don't need it. The best way to use such tricks is to concentrate on tasks that really need "the best and fastest code ever". These are drivers and situations where speed is worth its weight in gold. Usually this is done by embedding the necessary asm directives in C or another language. Writing everything in pure assembler is impractical, and the result may become harder to understand than the Rosetta Stone.

    However, the article does make a point: how unoptimised present compilers are. For example, GCC is mostly C written in C. That makes it highly portable, but if anyone decided to repeat the Turbo Pascal feat (most of its base code was assembler), I know the binary code would shrink enormously. Right now we may not feel this drawback, as bloat still doesn't clog everything. In the future this situation may change if speed and reliability become higher priorities.

    A note for the bloat FUDders: this is not the reason Linux distros are bloated. First, learn to be rational about your needs and don't install everything on one box. Second, learn a little bit of administration, maybe some programming, and kick that (mega_kernel) + (some_highly_featured_libs) + (several_useless_apps) out of your box. Then you will know that Linux can help fry eggs on your processor with lightning speed. Till then, keep the flame to yourself and read "Why I switched from Mac to Windows".

  22. People are missing the point by Kenneth+Stephen · · Score: 5, Insightful

    Looking through the comments here I see two main threads: (1) Squeezing the last few bytes of overhead out of a program leads to a hard to understand / maintain program and thus is not worth the effort. (2) What's the big deal anyway in this era of 100 GB disks and 2 GHz processors?

    While both these criticisms are valid, they miss the point. Firstly, it wasn't the objective of the author to squeeze the last few bytes out of that program to save resources. He was just putting his hard-earned knowledge to use. He was doing it because he could! This is the same motivation for people who climb mountains: because the mountain is there, and because they can climb it. Indeed, if the author were seriously looking into saving resources, he'd hardly be wasting his time on a trivial program, would he?

    Secondly, one of the author's intentions was to demonstrate the limits to which austerity could be taken. Certainly, this was a trivial program - but the same principles could be used to shrink larger non-trivial programs, and in those cases, the savings could possibly be larger. Of course, in those cases, the largest savings would come from a good optimizing compiler rather than crunching the headers together. More importantly, the author has exposed whole new ideas and lines of possibilities to programmers.

    --

    There is no such thing as luck. Luck is nothing but an absence of bad luck.

  23. Excellent troll! by PurpleFloyd · · Score: 4, Insightful
    Linux software is horribly bloated, like even "ls" is above 30k
    ls is probably statically linked (all necessary libs reside within the executable), so it will function in almost any circumstance where the executable itself is not corrupted. Would you really want to try to repair a broken system without ls? Most critical utilities and shells are available in statically-linked forms (if not, you can do it yourself). While executable size is an important consideration, it isn't the only one. I would rather have a set of basic programs (like ls) that work even if all the lib directories are toasted, than to save a few K here and there, and have a system that could never pull itself back up if broken.
    --

    That's it. I'm no longer part of Team Sanity.
    1. Re:Excellent troll! by DeeKayWon · · Score: 5, Informative
      Funny. My /bin/ls (Debian unstable) is nearly 60k, yet is dynamically linked, and is even stripped.

      % ls -l /bin/ls
      -rwxr-xr-x 1 root root 59592 Oct 8 20:17 /bin/ls*

      % ldd /bin/ls
      librt.so.1 => /lib/librt.so.1 (0x40022000)
      libc.so.6 => /lib/libc.so.6 (0x40034000)
      libpthread.so.0 => /lib/libpthread.so.0 (0x40147000)
      /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

      % file /bin/ls
      /bin/ls: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.0, dynamically linked (uses shared libs), stripped
  24. It doesn't save any disk space by Gerry+Gleason · · Score: 5, Insightful
    Below the filesystem size quantum, you don't save anything anymore. You can't allocate less than a page of memory either.

    Beyond some point, the article is really just silliness, interesting or not. Below 512 bytes, you're not going to save anything on any system. OK, there are filesystems that compress things further for squeezing into flash memory and such, so maybe there are some marginally useful applications, but still the header overlapping is a bit much to be worth considering.
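
The allocation-quantum argument above comes down to rounding up to whole blocks. A minimal sketch, assuming a filesystem that gives every file whole blocks and never shares a block between files; `allocated_bytes` is a made-up helper name, and the sample sizes are figures quoted elsewhere in this thread.

```python
def allocated_bytes(file_size: int, block_size: int = 512) -> int:
    """Space consumed on disk when files occupy whole blocks only.

    Assumes no tail packing or compression: each file gets
    ceil(file_size / block_size) blocks to itself.
    """
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


# Sizes mentioned in the thread: the 26-byte Perl script, the 324-byte
# PE executable, and the 3,998-byte gcc build.
for size in (26, 324, 3998):
    print(size, "->", allocated_bytes(size))
```

With 512-byte blocks, anything from 1 to 512 bytes costs the same single block, which is the point being made; a filesystem that packs file tails together would not follow this rule.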

    1. Re:It doesn't save any disk space by p3d0 · · Score: 5, Informative
      Consider:
      1. If you compress these things onto a floppy, every byte counts.
      2. Some filesystems like ReiserFS use tail packing to put multiple files (or file tails) into a single block.
      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  25. 4K Demos by Wraithlyn · · Score: 5, Interesting

    Some of the 4K demos I've seen written for ASM competitions completely blow my mind... check out this one, it's basically a flythrough of the first level of Descent, with texture mapping, source lighting, animated lava and recharger field, a MIDI soundtrack, etc... all in 4095 bytes!!!

    Here is Sanction's home page, it contains a couple more very impressive 4K demos.

    --
    "Mind, as manifested by the capacity to make choices, is to some extent present in every electron." -Freeman Dyson
  26. Re:That goes to show those C bigots by Waffle+Iron · · Score: 5, Insightful
    Hand-tuned assembler is always faster/smaller/better than C code, except when it comes to portability.

    Maybe in theory. In practice, once your program gets too big to fit it all in your head at once, you're going to run out of the mental energy required to stay ahead of the C compiler (and remain bug-free).

    If you've disassembled the output of a good optimizing compiler lately, you'd see that it usually produces pretty good code. Except for the inner loops of numerical algorithms, I doubt that anyone will consistently be able to produce code that is more than 25% faster than the C compiler.

    The thing is, the compiler is able to spit out this code at thousands of lines per minute all day long. It doesn't get tired. The human programmer is going to get tired of the boredom, and will start creating higher level abstractions in assembly. He'll start using macros. He'll use a simplified parameter passing protocol so that he doesn't have to inline and hand-allocate the registers for every little subroutine call.

    Before long, he's fallen behind, and the C code will run faster overall. And the C program will have taken less time to write, as well.

  27. What about the Visual Studio .NET compiler? by Jugalator · · Score: 5, Interesting
    Stand-alone console EXE (Release Build):
    #include "stdafx.h"

    int _tmain(int argc, _TCHAR* argv[])
    {
        return 42;
    }
    Size: 20,992 bytes

    To be compared with the non-optimized gcc version at 3,998 bytes. :-)

    I wonder how small you can make a Windows EXE..
    --
    Beware: In C++, your friends can see your privates!
    1. Re:What about the Visual Studio .NET compiler? by Jugalator · · Score: 5, Interesting

      Hm... I stripped the code from stdio.h, replaced _TCHAR* with char* so the stdafx.h doesn't really do much at all. Then turned on size optimizations and turned off boundary checks etc. in the compiler. Still exactly 20992 bytes. Huh?? Browsed the exe and there are full text messages like "ooh something corrupted the state of this program and it cannot safely continue". Which is actually a great addition by Microsoft, but can't you remove such things? :-)

      But I guess the .NET compiler has its lower limits where bloat gets called a feature; it's just surprising that it seems to compile at the minimum size by default... Or perhaps it uses some kind of silly padding so even if there's less code, the physical size isn't reduced.

      --
      Beware: In C++, your friends can see your privates!
  28. Re:Turbo Pascal by cvore · · Score: 5, Informative

    A DOS .com file does not have a lower limit. .COM files have no headers, so making a really tiny .com file is not very hard ;) It says more about the crap Turbo Pascal puts in the .com file... A .com file that returns correctly can have just one byte in it: 0xC3 (RET).

  29. asmutils does a good portion of this by cgleba · · Score: 5, Informative

    http://linuxassembly.org/asmutils.html

    Check it out, download it and assemble it.
    They create the smallest set of binaries for the basic Linux tools that I have found, and they employ a good portion of the stuff mentioned in this paper.

    They make busybox look bloated by comparison.

    Another neat trick is to use the ld option "--gc-sections" (passed through gcc as "-Wl,--gc-sections") when linking a static binary -- it tries to weed out all the unused portions of the libraries it links against.

    The last trick I usually use is to link against uClibc or dietlibc rather than glibc. Makes a noticeable difference. Red Hat has been working on a C library called "newlib" which is supposed to do the same thing as uClibc or dietlibc but better (for embedded stuff).

  30. Small size != More efficient by yorgasor · · Score: 5, Insightful

    Just because a program or executable file is smaller, doesn't necessarily mean it's more efficient. For instance, some compiler optimizations actually produce larger executables. If you unroll a loop, it actually generates code for each iteration of the loop, but saves time because it's faster to keep going forward than to branch backwards to run through the code again.
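
To make the size-for-speed trade concrete, here is a sketch of the same summation written rolled and manually unrolled four ways. Both function names are made up for illustration, and Python is used only to show the shape of the transformation: an interpreter will not necessarily show the speedup that a compiler gets from unrolling machine code.

```python
def sum_rolled(values):
    """One addition and one loop test per element."""
    total = 0
    for v in values:
        total += v
    return total


def sum_unrolled4(values):
    """Same result, with the loop body written out four times.

    More code, but only one loop test (one backward branch, in
    compiled terms) per four elements.
    """
    n = len(values)
    limit = n - (n % 4)  # largest multiple of 4 not exceeding n
    total = 0
    i = 0
    while i < limit:
        total += values[i]
        total += values[i + 1]
        total += values[i + 2]
        total += values[i + 3]
        i += 4
    # Leftover elements when n is not a multiple of 4.
    while i < n:
        total += values[i]
        i += 1
    return total
```

The unrolled version is several times longer for identical behaviour, which is the "bigger but not bloated" case the comment describes.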

    Similarly, you can have inline functions that insert the inline function directly into the function calling it. Every function that calls an inline function would get a copy of it, which produces larger code, but saves a lot of time since it doesn't need to push the arguments on the stack, branch to the new function, and return with the value.

    Finally, the biggest speed gains you can get are generally algorithmic in nature. You can do a bubble sort with just a few lines of code. It's a lot simpler code and smaller than the larger and more complicated quick sort or merge sort. I know which one I'd rather wait for with a million items to sort.
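
A sketch of the two ends of that comparison, written from the textbook descriptions rather than from any particular library. The bubble sort is the shorter of the two but does on the order of n^2 comparisons; the merge sort needs more code but does on the order of n log n.

```python
def bubble_sort(items):
    """Short and simple: repeatedly swap adjacent out-of-order pairs."""
    data = list(items)
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data


def merge_sort(items):
    """Longer: split in half, sort each half, then merge the results."""
    data = list(items)
    if len(data) <= 1:
        return data
    mid = len(data) // 2
    left = merge_sort(data[:mid])
    right = merge_sort(data[mid:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One side is exhausted; the remainder of the other is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

For a million items the first does roughly half a trillion comparisons and the second roughly twenty million, so the larger program is the one worth waiting for.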

    So remember, just because something is bigger, doesn't mean it's more bloated, and just because something is smaller doesn't mean it's faster or more efficient.

    --
    Looking for a computer support specialist for your small business? Check out
  31. Kids don't try this at home :-) by Antity · · Score: 5, Interesting

    The first few examples are quite noteworthy, but when the author starts to put code inside the ELF header, it gets really ugly..

    Saying that these bytes are "only padding anyway for future extensions" doesn't feel that good. :-)

    This reminds me of early attempts on AmigaOS to shrink and speed up executables, where people could be sure that all available Amigas would only use the lower 24 bits of 32-bit address registers since the machines could only address 24 bits physically. So they put application data into the upper 8 bits of registers. Worked fine.

    Then came newer machines which really used the full set of 32 address lines and all those dirty programs crashed without obvious reason..
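
The failure mode is easy to model. A sketch under simplifying assumptions: the hardware is reduced to "the bus sees the low N bits of the register", and `tag_pointer` and `bus_address` are invented names for illustration, not anything from AmigaOS.

```python
ADDRESS_MASK_24 = 0x00FFFFFF


def tag_pointer(address: int, tag: int) -> int:
    """Hide an 8-bit tag in the top byte of a 32-bit address register."""
    return ((tag & 0xFF) << 24) | (address & ADDRESS_MASK_24)


def bus_address(register: int, address_lines: int) -> int:
    """Address actually driven onto a bus with the given number of lines."""
    return register & ((1 << address_lines) - 1)


reg = tag_pointer(0x123456, tag=0xAB)

# With 24 address lines the tag is invisible to the hardware.
print(hex(bus_address(reg, 24)))
# With 32 address lines the tag becomes part of the address.
print(hex(bus_address(reg, 32)))
```

The same register value names the intended location on the 24-bit machine and an unrelated one on the 32-bit machine, which matches the crashes described above.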

    The author says "if we leave compatibility behind.." but what he's doing is not only leaving inter-OS compatibility behind - what he creates isn't even an ELF executable anymore. It's just something that happens to work with this special Linux version.
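
For contrast, here is a sketch of what a layout with nothing overlapped looks like: a full 32-bit ELF header, one program header, then the code. The field order follows the standard Elf32_Ehdr and Elf32_Phdr structures; the load address, the name `build_elf` and the exact instruction bytes are assumptions made for illustration, and the result has not been verified to run on any particular kernel.

```python
import struct

LOAD_ADDR = 0x08048000  # assumed load address (a common 32-bit x86 choice)
EHDR_SIZE = 52          # size of Elf32_Ehdr
PHDR_SIZE = 32          # size of Elf32_Phdr

# Assumed program: exit(42) via the 32-bit Linux int 0x80 interface.
CODE = bytes([
    0xB3, 0x2A,  # mov bl, 42    ; exit status
    0x31, 0xC0,  # xor eax, eax
    0x40,        # inc eax       ; eax = 1, the exit syscall number
    0xCD, 0x80,  # int 0x80
])


def build_elf() -> bytes:
    file_size = EHDR_SIZE + PHDR_SIZE + len(CODE)
    # e_ident: magic, ELFCLASS32, little-endian, EV_CURRENT, 9 padding bytes.
    e_ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    ehdr = e_ident + struct.pack(
        "<HHIIIIIHHHHHH",
        2,                                   # e_type: ET_EXEC
        3,                                   # e_machine: EM_386
        1,                                   # e_version
        LOAD_ADDR + EHDR_SIZE + PHDR_SIZE,   # e_entry: first byte of CODE
        EHDR_SIZE,                           # e_phoff: phdr follows ehdr
        0,                                   # e_shoff: no section headers
        0,                                   # e_flags
        EHDR_SIZE,                           # e_ehsize
        PHDR_SIZE,                           # e_phentsize
        1,                                   # e_phnum
        0, 0, 0,                             # e_shentsize, e_shnum, e_shstrndx
    )
    phdr = struct.pack(
        "<IIIIIIII",
        1,            # p_type: PT_LOAD
        0,            # p_offset: map the file from its start
        LOAD_ADDR,    # p_vaddr
        LOAD_ADDR,    # p_paddr
        file_size,    # p_filesz
        file_size,    # p_memsz
        5,            # p_flags: read + execute
        0x1000,       # p_align
    )
    return ehdr + phdr + CODE


print(len(build_elf()))
```

Laid out this way the file comes to 52 + 32 + 7 = 91 bytes with every header field in its specified place; anything smaller has to start reusing header bytes for other purposes, which is where the compatibility worry above begins.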

    So since this isn't even an ELF executable any more, there's no reason not just to write "exit 42" in bash (which would be an amazing 8 bytes in size *g*).

    Don't misunderstand me, I really like those hacks. But I myself will never, ever again code something that is prone to break in the future just because I didn't follow standards.

    One could say that this is what programming is about. :-) No offence meant.

    --
    42. Easy. What is 32 + 8 + 2?
  32. 26 bytes by bcrowell · · Score: 4, Funny
    Yeah, and now here's a thumb in the eye for all those C bigots and all those assembler bigots:

    $ cat >a.pl
    #!/usr/bin/perl
    exit(42);

    $ chmod +x a.pl
    $ ./a.pl
    $ echo $?
    42
    $ ls -l a.pl
    -rwxr-xr-x 1 bcrowell bcrowell 26 Oct 19 12:41 a.pl

    Only takes up 26 bytes on my hard disk!
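
The byte counts for the script versions are easy to check by hand. A small sketch, assuming each script ends with a single newline and nothing else; the "exit 42" one-liner is the bash version suggested earlier in the thread.

```python
# The Perl script from the listing above, as it would sit on disk.
perl_script = b"#!/usr/bin/perl\nexit(42);\n"

# The bash one-liner mentioned earlier in the thread.
bash_script = b"exit 42\n"

print(len(perl_script), len(bash_script))
```

Under that assumption the lengths come out to 26 and 8 bytes, matching the ls output and the earlier claim.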

  33. Embedded systems need efficiency by yerricde · · Score: 4, Interesting

    CPU is cheap, hard disk is cheap.

    Maybe on PCs, but not on embedded systems, handheld systems, or game consoles. The Game Boy Advance, for instance, has only 384 KB of RAM, and all but 32 KB are 16-bit bus width with muchos wait states. Many microcontrollers inside such things as microwave ovens are as powerful as an Atari 2600 VCS, with 128 bytes of RAM and about 12 bytes of VRAM (if that).

    --
    Will I retire or break 10K?
  34. Re:Interesting topic... by topham · · Score: 4, Insightful

    If you ever found a bug in code optimized to that degree you would NOT want to fix it as it could require a complete rewrite from scratch.

  35. OT: Re:What about the Visual Studio .NET compiler? by WhaDaYaKnow · · Score: 5, Informative
    Yeah, look at this example for the printf implementation:
    extern "C" int __cdecl printf(const char * format, ...)
    {
        char szBuff[1024];
        int retValue;
        DWORD cbWritten;
        va_list argptr;

        va_start( argptr, format );
        retValue = wvsprintf( szBuff, format, argptr );
        va_end( argptr );

        WriteFile( GetStdHandle(STD_OUTPUT_HANDLE), szBuff, retValue,
                   &cbWritten, 0 );

        return retValue;
    }
    Gosh, I wonder how come M$ has so many problems with secoority. 1024 bytes on the stack, without overrun checking. Wonderful stuff indeed.

    You may say, yeah but how often will you printf more than 1024 bytes? Exactly: practically never. Which is why this sort of crap doesn't show up in testing and DOES show up when people are trying to crack it.