Slashdot Mirror


Smallest Possible ELF Executable?

taviso writes "I recently stumbled across this paper (google cache), in which the author investigates the smallest possible ELF executable on Linux. There's some interesting stuff in it, and it's well worth a read. The author concludes, 'every single byte in this executable file can be accounted for and justified. How many executables have you created lately that you can say that about?'"

27 of 451 comments

  1. Glibc is the thief! by Goodbyte · · Score: 2, Informative

    I've always wondered what all the glibc overhead (compared to e.g. uClibc) does. I've never noticed any functional difference when setting up an initrd image using uClibc instead of glibc.

  2. Optimized Executables by aking137 · · Score: 5, Informative

    I think there are quite a few. It's seen as a challenge, and does have practical uses. Have a look at Tom's Rootboot disk - it includes a web server, a telnet server, a telnet client, an nfs client, wget, gzip, bzip2, vi, a whole load of network drivers, and a tonne of other stuff, all compressed down onto one floppy disk. However, I've never quite been able to find the source code for any of it, despite spending a small amount of time looking - possibly someone would be able to put me right on that one.

    There are also lots of interesting articles on linuxassembly.org.

    Andrew

    1. Re:Optimized Executables by op00to · · Score: 2, Informative

      The problem is that the FILESYSTEM itself cannot allocate fewer than, say, n bytes to a file. So, if you have a program that is n-1 bytes, it still takes up n bytes. So, it's really not all that practical in 99.999999% of uses (that includes boot disks.)
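To make the block-granularity argument concrete, here is a minimal Python sketch. The 1024-byte block size and the function name `allocated_bytes` are assumptions chosen for illustration, and the model deliberately ignores compression and tail packing.

```python
def allocated_bytes(file_size: int, block_size: int) -> int:
    """Disk space consumed when every file occupies whole blocks.

    Assumes a simple filesystem with no compression and no tail packing.
    """
    if file_size == 0:
        return 0
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


# Hypothetical 1024-byte blocks: tiny and near-full files cost the same.
for size in (45, 1023, 1024, 1025):
    print(size, "->", allocated_bytes(size, 1024))
```

Under this model a 45-byte file and a 1023-byte file cost the same on disk, which is the point being made here. The model stops applying once the image is compressed or the filesystem packs file tails together.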

    2. Re:Optimized Executables by Sparr0 · · Score: 5, Informative

      You are incorrect. The filesystem is stored on the disk in a compressed form and is decompressed to a RAM drive. Every byte DOES count.

    3. Re:Optimized Executables by Fastolfe · · Score: 5, Informative

      Your information is dated. There are smarter filesystems nowadays that can allocate data from more than one file into a single "cluster". ReiserFS is one such filesystem for Linux, but there are surely others.

    4. Re:Optimized Executables by Anonymous Coward · · Score: 2, Informative

      Ah, yes, I remember Jet. Jet was a fun flight simulator when graphics were at their most primitive. The program was copy protected, so it wasn't easy to just copy the 360k floppy. The 76k you are referring to was just the loader; the full program resided hidden on the rest of the disk, where DOS could not see it. I had a copy a while back but alas, I finally decided to format over it.

      Anyone know of a site that has that program?

  3. Bigger is Sometimes Better by ksw2 · · Score: 5, Informative
    I used to kill myself trying to strip a few lines of code from my programs... in my mind, I was trying to emulate the PDP hackers of the 60s (my heroes) by finding the one "Right Thing" for each program.

    Soon I realized that smaller programs are not the end-all goal of programming. If a slightly bigger program is easier to understand for the next person who modifies/maintains it, then that is the new "Right Thing" for that application... and I realized the efficient programming of the PDP days was a byproduct of necessity more than anything else. It's seldom needed with today's blazing hardware capabilities.

    This isn't to say that many of today's programs aren't over-bloated, just to reinforce that there is a trade-off between small and easy to understand.

  4. Re:Even Shorter... by zsmooth · · Score: 3, Informative

    No, I don't think you can make it any shorter even by removing that call. The program is 45 bytes, and the 45th byte is required to be there (a critical part of the ELF header), or else it won't execute at all.
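For readers wondering what sits at byte 45, the following Python sketch walks the standard 32-bit ELF header layout (Elf32_Ehdr) and reports which field contains a given byte. It assumes the ordinary field order and sizes from the ELF specification. The helper names are made up for illustration, and it says nothing about how the paper's author overlaps the program header with this one.

```python
import struct

# Field layout of a 32-bit little-endian ELF header (Elf32_Ehdr).
EHDR_FIELDS = [
    ("e_ident", "16s"), ("e_type", "H"), ("e_machine", "H"),
    ("e_version", "I"), ("e_entry", "I"), ("e_phoff", "I"),
    ("e_shoff", "I"), ("e_flags", "I"), ("e_ehsize", "H"),
    ("e_phentsize", "H"), ("e_phnum", "H"), ("e_shentsize", "H"),
    ("e_shnum", "H"), ("e_shstrndx", "H"),
]


def field_offsets():
    """Return ({field name: byte offset}, total header size)."""
    offsets, pos = {}, 0
    for name, fmt in EHDR_FIELDS:
        offsets[name] = pos
        pos += struct.calcsize("<" + fmt)
    return offsets, pos


def field_at(byte_number):
    """Name of the header field containing the given 1-based byte."""
    index, pos = byte_number - 1, 0
    for name, fmt in EHDR_FIELDS:
        size = struct.calcsize("<" + fmt)
        if pos <= index < pos + size:
            return name
        pos += size
    raise ValueError("byte lies outside the ELF header")


offsets, total = field_offsets()
print("header size:", total)
print("byte 45 belongs to:", field_at(45))
```

In the standard layout a full header is 52 bytes, and byte 45 is the first byte of `e_phnum`, the program header count. That fits the claim that the file cannot end any earlier.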

  5. Re:umm.... yeah? by Anonymous Coward · · Score: 1, Informative

    Did you bother to read it?

    That still included the C stdlib.

    From the article, after removing that:

    "Now that's tiny! Almost a fourth the size of the previous version!"

  6. nasm by elykyllek · · Score: 3, Informative

    The nasm assembler site that he mentions in the article seems /.'d; there's a SourceForge project site instead.

  7. Re:No need to be smaller than 512 really... by Russ+Steffen · · Score: 3, Informative

    Unless, of course, you're using ReiserFS with tail packing turned on.

  8. Re:4K Demos by Ektanoor · · Score: 3, Informative

    The demo scene has always beaten the usual coders. Not long ago we had a national festival with guys coming from all over Russia. Some demos, mainly Amiga and Spectrum, were impressive. Some 3D effects were shown on machines that lack any type of acceleration. And these things ran at nearly the same speeds we frequently saw on some powerful Pentiums. Besides, the PC demos presented things shrunk to the impossible, with speed, sound, space and color effects that beat many popular games.

    I wonder what speed and effects something like Doom III would have if it were written mainly in asm...

  9. Re:Smallest possible size by kscguru · · Score: 2, Informative
    No, but RAM has pages. 4K on x86, as I recall. You can't pull anything smaller out of the kernel - though you could pack it into another process's address space (but that defeats the purpose of a small executable anyway).

    And no, you can't change the page size; it's hardware-dependent. Most of the other archs seem to have similar or larger pages, too.
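A small Python sketch of the page-granularity point: it prints the page size of the machine it runs on and rounds a size up to whole pages. The 4096-byte default is taken from the 4K figure mentioned above, and `pages_needed` is a hypothetical helper, not a kernel interface.

```python
import mmap


def pages_needed(size_bytes: int, page_size: int = 4096) -> int:
    """Whole pages required to hold size_bytes (ceiling division)."""
    return -(-size_bytes // page_size)


print("page size on this machine:", mmap.PAGESIZE)
print("pages for a 45-byte image:", pages_needed(45))
print("pages for a 4097-byte image:", pages_needed(4097))
```

Even a 45-byte executable therefore occupies at least one full page once it is mapped.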

    Why do I know this? It's "write your own VM" month in my OS class. Next week we get to start swapping out to disk...

    --

    A witty [sig] proves nothing. --Voltaire

  10. Re:Excellent troll! by DeeKayWon · · Score: 5, Informative
    Funny. My /bin/ls (Debian unstable) is nearly 60k, yet is dynamically linked, and is even stripped.

    % ls -l /bin/ls
    -rwxr-xr-x 1 root root 59592 Oct 8 20:17 /bin/ls*

    % ldd /bin/ls
    librt.so.1 => /lib/librt.so.1 (0x40022000)
    libc.so.6 => /lib/libc.so.6 (0x40034000)
    libpthread.so.0 => /lib/libpthread.so.0 (0x40147000)
    /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

    % file /bin/ls
    /bin/ls: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.0, dynamically linked (uses shared libs), stripped
  11. Re:Turbo Pascal by cvore · · Score: 5, Informative

    A DOS .COM file does not have a lower limit. .COM files are without headers, so having a really tiny .COM file is not very hard ;) It says more about the crap Turbo Pascal puts in the .COM file.. a .COM file that returns correctly can just have one byte in it: 0xc3 (RET)
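As an illustration of how little a .COM file needs, this Python sketch writes the one-byte program described above into a temporary directory. The file name `TINY.COM` and the helper `write_tiny_com` are placeholders, and the script only creates the file; it does not run it under DOS.

```python
import os
import tempfile

RET = bytes([0xC3])  # the x86 near-return opcode mentioned above


def write_tiny_com(path):
    """Write a headerless one-byte .COM image and return its size."""
    with open(path, "wb") as f:
        f.write(RET)
    return os.path.getsize(path)


with tempfile.TemporaryDirectory() as d:
    size = write_tiny_com(os.path.join(d, "TINY.COM"))
    print("bytes written:", size)
```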

  12. Re:You disgust me... by whereiswaldo · · Score: 1, Informative

    *cough* Karma whore *cough cough*

  13. Re:Turbo Pascal by Anonymous Coward · · Score: 3, Informative

    - I seem to remember making some damn small Turbo Pascal .COM files. Under 4096 bytes, IIRC.

    The Amiga E language (sort of a Pascal+C+whatever-like beast) compiler stuffed a hello world program into 80 bytes or so. Pure executable, no external libraries needed.
    The author's list of self-designed languages is definitely worth a look.

  14. asmutils does a good portion of this by cgleba · · Score: 5, Informative

    http://linuxassembly.org/asmutils.html

    Check it out, download it and assemble it.
    They create the smallest set of binaries for the basic Linux tools that I have found, and they employ a good portion of the stuff mentioned in this paper.

    They make busybox look bloated by comparison.

    Another neat trick is to use the option "-Wl,--gc-sections" when linking a static binary -- it tries to weed out all the unused portions of the libraries it links against.
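A hedged sketch of what such a build invocation might look like, written as a Python helper that only assembles the argument list. The file names are hypothetical, and pairing `--gc-sections` with `-ffunction-sections` and `-fdata-sections` is an addition not mentioned above: those compiler flags give each function and data item its own section, so the linker has something fine-grained to discard.

```python
def static_link_command(source, output, cc="gcc"):
    """Build a gcc command line for a small static binary."""
    return [
        cc,
        "-Os",                  # optimize for size
        "-static",              # link statically
        "-ffunction-sections",  # one section per function
        "-fdata-sections",      # one section per data item
        "-Wl,--gc-sections",    # let ld drop unreferenced sections
        "-s",                   # strip symbols from the output
        "-o", output,
        source,
    ]


# hello.c and hello are placeholder names.
print(" ".join(static_link_command("hello.c", "hello")))
```

The returned list could be passed to `subprocess.run` on a machine with gcc installed. How much it saves depends on the libc being linked.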

    The last trick I usually use is to link against uClibc or dietlibc rather than glibc. Makes a noticeable difference. RedHat has been working on a C library called "newlib" which is supposed to do the same thing as uClibc or dietlibc but better (for embedded stuff).

  15. Re:It doesn't save any disk space by p3d0 · · Score: 5, Informative
    Consider:
    1. If you compress these things onto a floppy, every byte counts.
    2. Some filesystems like ReiserFS use tail packing to put multiple files (or file tails) into a single block.
    --
    Patrick Doyle
    I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  16. Re:No need to be smaller than 512 really... by reitoei1971 · · Score: 2, Informative

    True, but a small ELF still takes up less memory once loaded from disk. Though with 512MB+ in today's systems, people can even afford to run more than one Office XP app.

  17. Re:What about the Visual Studio .NET compiler? by seattle2napa · · Score: 1, Informative
    Change this to not include anything, change main to "int main()", and then compile like this:

    cl ft.cpp /O2 /link /opt:ref /entry:main

    1024 bytes - the linker always pads to the next page.

    If you look at the .text section, it's 16 bytes.

  18. ELFIO by Anonymous Coward · · Score: 1, Informative
    Those interested in creating ELF files may wish to look at a SourceForge project called ELFIO:
    • http://sourceforge.net/projects/elfio/

    A "Hello world" program generated with this API takes only 267 bytes.
    • http://elfio.sourceforge.net/c66.htm
  19. OT: Re:What about the Visual Studio .NET compiler? by WhaDaYaKnow · · Score: 5, Informative
    Yeah, look at this example for the printf implementation:
    extern "C" int __cdecl printf(const char * format, ...)
    {
        char szBuff[1024];
        int retValue;
        DWORD cbWritten;
        va_list argptr;

        va_start( argptr, format );
        retValue = wvsprintf( szBuff, format, argptr );
        va_end( argptr );

        WriteFile( GetStdHandle(STD_OUTPUT_HANDLE), szBuff, retValue,
                   &cbWritten, 0 );

        return retValue;
    }
    Gosh, I wonder how come M$ has so many problems with secoority. 1024 bytes on the stack, without overrun checking. Wonderful stuff indeed.

    You may say, yeah, but how often will you printf more than 1024 bytes? Exactly: practically never. Which is why this sort of crap does not show up in testing and DOES show up when people are trying to crack it.
  20. Several things by Sycraft-fu · · Score: 3, Informative

    1) The increase in OS requirements is partially due to the increase in OS functions. XP provides a lot more eye candy than NT, which needs more processor power to handle. You may not think it's a good idea, but most people like it.

    2) The increase in OS requirements is mainly due to an increase in software requirements as a whole. An OS is worthless if you can't run anything on it, so you need to set your requirements with software in mind. MS made this mistake with Windows 95. Yes, technically it would run with 4MB of RAM, but that wasn't enough to load anything else. XP's stated minimum isn't the actual minimum, but a practical one when you account for applications.

    3) As others have mentioned, compact code comes at the price of maintainability. Sure, I can write a program in 100% assembly, and then if I'm really good tweak the machine code to make sure it is as efficient as possible. Now try and maintain that. This is hard enough if it's a tiny app, but if it is something large like, say, Mozilla, even the original programmer would find maintenance very difficult and anyone else would find it almost impossible.

    4) Along those lines, portability requires that you code in a higher level language, and often that you make some changes that increase your code size. If you do everything in optimised assembly, well, it's a one platform thing. I can guarantee that you have to do a massive rewrite of an assembly Windows app if you want to make it run on x86 Linux, just because of the API differences. If you are talking about another hardware platform, then it's a total and complete rewrite.

    5) Your 64k demo thing, I'm assuming, refers to the now infamous Farbrausch demos. It is simply stunning what they can get done in 64k, BUT it comes at a huge price. First, there is the memory usage: look at your task manager sometime when one of those is running; they use something like 80MB. Because of their tiny disk usage they have to decompress to memory. Second, their compatibility is horrible: their newer one, FR22, works properly on my system at work but not at home, the only big difference being that at home I have a GeForce 4 and at work I have a GeForce 3. Finally, these things are only made possible by the "bloated" Windows framework, with things like DirectX to simplify low level access.

    6) Most people see little point in trying to make things run well on a 386 when you can get an entire system running at over 1GHz for about $500.

  21. Wired 1995 by lonedfx · · Score: 2, Informative

    Wired 1995 Surprise coding compo :

    Write the smallest possible .com program that does the following :
    1, Input a number from the keyboard, call it N
    2, Switch to mode 13h (VGA), draw N 3x3 squares without the central pixel (N * 8 pixels to draw), no square should be adjacent to another.
    3, Wait for Enter
    4, Exit
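The pixel arithmetic in the rules above can be sketched in Python. This only illustrates the geometry (8 pixels per hollow square, a one-pixel gap between squares on the 320x200 mode 13h screen). The actual entries were tiny x86 programs, and the grid layout and function names here are assumptions, not anything the contestants used.

```python
def hollow_square(x, y):
    """The 8 border pixels of a 3x3 square with top-left corner (x, y)."""
    return [(x + dx, y + dy)
            for dy in range(3) for dx in range(3)
            if (dx, dy) != (1, 1)]  # skip the central pixel


def layout(n, width=320, height=200, pitch=4):
    """Place n squares on a grid; pitch 4 leaves a 1-pixel gap between them."""
    per_row = width // pitch
    if n > per_row * (height // pitch):
        raise ValueError("too many squares for the screen")
    pixels = []
    for i in range(n):
        col, row = i % per_row, i // per_row
        pixels.extend(hollow_square(col * pitch, row * pitch))
    return pixels


pixels = layout(5)
print(len(pixels), "pixels for 5 squares")
```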

    Results were

    1: Walken/Impact Studios, 48 bytes
    1: (ex aequo) Paranoia, 48 bytes
    2: KLF, 51 bytes

    For info, Walken's version drew the squares at different positions every time his program was run (don't ask me how) :-)

    Our own attempt (aegis) yielded 52 bytes, but we were disqualified because we did not support the key "0" :-)

    ahh... fun...

    lone, dfx.

  22. Re:I ask out of total ignorance... by flossie · · Score: 3, Informative

    No. The 45th byte of the resultant program is required to be there as part of the ELF header (Linux won't run the program otherwise). The code which generates the value 42 occurs way before the 45th byte of the program in an unused portion of the header. In fact, the return value could be a couple of bytes longer without changing the length of the overall program.

  23. Re:Going for massively off-topic here but... by Nexx · · Score: 2, Informative

    "Ah, kamisama! Ore no atama ni ono ga arimasu yo!" is quite correct. To my ears, the use of "wa" makes the rhythm of the sentence quite lethargic. Though it may not be correct textbook Japanese (the use of the word "yo" already makes it conversational, and thus, some leeway in grammar is allowed), between peers, this usage will be quite acceptable.

    Yes, I do speak Japanese fluently :-P