Slashdot Mirror


Any "Pretty" Code Out There?

andhow writes "Practically any time I hear a large software system discussed I hear "X is a #%@!in mess," or "Y is unmanageable and really should be rewritten." Some of this I know is just fresh programmers seeing their first big hunk o' code and having the natural reaction. In other cases I've heard it from main developers, so I'll take their word for it. Over time, it paints a bleak picture, and I'd really like to know of a counterexample. Getting to know a piece of software well enough to ascertain its quality takes a long time, so I submit to the experience of the readership: what projects have you worked on which you felt had admirable code, both high-level architecture and in-the-trenches implementation? In particular I am interested in large user applications using modern C++ libraries and techniques like exception handling and RAII."

25 of 658 comments

  1. Amarok? by HappySmileMan · · Score: 3, Informative

    I'm just a 15 year old with a basic knowledge of C++. I've cracked open some source packages from time to time to test how much I know, and Amarok seemed fairly well done to me, though that is of course compared to other packages. I still had to do a little bit of searching around to understand it.

    Also the Last.fm player seems fairly well done, though for both these programs I didn't look through the full code or change anything, so maybe I just happened to stumble across the only 2-3 human-readable source files?

  2. BOOST by alyosha1 · · Score: 3, Informative

    The boost libraries tend to be a pleasure to work with. BOOST::Python especially continues to surprise me by how much it 'just works'. That said, I haven't had much need to look at the source code itself, but there seems to be a strong desire in the boost community to do things in as clean a way as possible.

  3. LLVM is a pretty C++ application by Anonymous Coward · · Score: 1, Informative

    Check out the LLVM project, a compiler infrastructure package written in C++. It's a pleasure to hack on LLVM, but a nasty chore to hack on GCC.

  4. The linux kernel by A+beautiful+mind · · Score: 2, Informative

    Its code is pretty good. The quality and formatting standards are pretty high for the kernel, which shows in the research about bugs/line ratios too.

    --
    It takes a man to suffer ignorance and smile
    Be yourself no matter what they say
  5. nojoke by Anonymous Coward · · Score: 1, Informative

    emacs source. a work of art.

  6. OpenBSD by Anonymous Coward · · Score: 1, Informative

    wins hands down. Contrary to popular belief, it's not about security, it's about quality. Security ensues.

  7. OpenSolaris by jlarocco · · Score: 5, Informative

    As large and old as it is, OpenSolaris has fairly readable code. Plus, most of it has comments explaining why it's done the way it is.

  8. damn good by r00t · · Score: 2, Informative

    Some parts are NOT for newbie wimps, but the complex parts are well-justified. Most of the core code ("kernel" directory) is very clean and readable.

    There are useful well-written abstractions, without the typical obfuscating layers of abstraction fluff.

    The code is written to run fast, while still being portable and readable.

    Static checking is all over, but not in-your-face annoying. Some of it involves compile-time assertions. Some of it involves a lint-like tool called "sparse" which makes sure that people don't do things like random math operations on bitmasks and wrong-endian data. Sparse also stops accidental (unsafe) use of user pointers from the kernel.
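
    To illustrate what a compile-time assertion buys you, here is a minimal sketch in C++11. The kernel itself is C and has its own macros for this (BUILD_BUG_ON is one), so this is not kernel code; PacketHeader, is_power_of_two and kRingSize are hypothetical names made up for the example.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical fixed-layout header; the fields are invented for illustration.
struct PacketHeader {
    std::uint16_t type;
    std::uint16_t length;
    std::uint32_t sequence;
};

// These checks run in the compiler. A violation stops the build
// instead of surfacing as a runtime bug.
static_assert(sizeof(PacketHeader) == 8,
              "PacketHeader must be exactly 8 bytes");
static_assert(std::is_trivially_copyable<PacketHeader>::value,
              "PacketHeader must be safe to copy byte-wise");

// Hypothetical helper usable in constant expressions.
constexpr bool is_power_of_two(std::uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

// Hypothetical tunable; changing it to a bad value breaks the build.
constexpr std::uint32_t kRingSize = 256;
static_assert(is_power_of_two(kRingSize),
              "ring size must be a power of two");
```

    The point is the same as in the kernel: the assumption is written down next to the code that relies on it, and nobody has to remember to test it.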

  9. Re:Hello World by MillionthMonkey · · Score: 5, Informative

    Not so secure when the company is sued for stealing source code. He took credit (with his copyright notice) for a very old joke. A blatant copy-and-paste. One has to wonder how much of that he does on the job.

    Ha ha, joke's on you, you dick- that "old joke" was written by me five years ago as part of a larger post and I was not at work- in fact it was way after hours and I was about to go home. I just started with the base concrete implementation and this is what it looked like after a few minutes of stuffing patterns into it- Singleton, Factory, and Strategy. I keep thinking one of these days I'll release a 2.0 version with Proxy and Bridge. Since I was the original author I retain the right to paste it wherever I want and to attach any license agreement I feel like attaching.

    This has become the most famous code I've ever written which is the sort of thing that makes you reflect on your career. So far it has netted me about 20-30 karma points over the years (lord knows how much karma was gotten from pirated copies). I found it being examined in some software engineering papers and it even made its way into one of the patterns books (as an example of "Patterns Happy" code). When I found out about that, I made the guy send me a free copy and acknowledge me in print so I can maybe net some jobs unnecessarily screwing up simple code with GoF patterns which always pays well. Now that I released it under the terms of the Apache license he might come back for his book.

  10. Re:Firefox by Anonymous Coward · · Score: 3, Informative

    pids are sequentially assigned.

  11. filling up the symbol table by r00t · · Score: 2, Informative

    I can well imagine that a linker would choke on Boost.

    For those with a Linux/BSD/Solaris system, try running the "nm" command against a solidly Boost-infected project. You're likely to find function names that are THOUSANDS of characters long.

    Think about what that means for program start-up, at least if you call into a library. The runtime linker has to chew through all that gunk. I've run a profiler on this kind of code, and sure enough the start-up time was dominated by looking up all those giant symbols.
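
    A rough way to see how quickly names grow, without reaching for nm: ask the compiler for the name of a nested template type. This is only a sketch. The Nested alias and type_name_length are hypothetical, the string returned by typeid(...).name() is implementation-defined (GCC and Clang return the mangled name), and the type below uses only the standard library, so it is tame compared with what heavy template code can instantiate.

```cpp
#include <cstddef>
#include <cstring>
#include <map>
#include <string>
#include <typeinfo>
#include <utility>
#include <vector>

// Hypothetical nested type standing in for a template-heavy instantiation.
using Nested = std::map<
    std::string,
    std::vector<std::pair<int, std::map<std::string, std::vector<double>>>>>;

// Length of the implementation-defined name of T.
// On GCC and Clang this is the length of the mangled type name.
template <typename T>
std::size_t type_name_length()
{
    return std::strlen(typeid(T).name());
}
```

    Printing type_name_length<Nested>() next to type_name_length<int>() shows the gap. Every function that takes such a type as a parameter carries that name in its own symbol, which is where the giant entries in the symbol table come from.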

  12. Re:BOOST::Python, but you haven't seen the source? by niteice · · Score: 2, Informative

    However, the thread issues are with Python, not Boost. There's a more detailed description in the Python docs, but basically the Python interpreter isn't designed to run more than once in the same process.

    --
    ROMANES EUNT DOMUS
  13. LLVM by Powder · · Score: 2, Informative

    http://llvm.org/ is one of the better C++ projects I've seen. Quite large, but also clean and tidy.

  14. Re:Comments lie. Code never lies. by cburley · · Score: 4, Informative

    I'll comment (inline) as someone who has come to appreciate certain of qmail's strengths even while tolerating (to varying degrees) its weaknesses:

    I thought it was hideous. From memory (it has been awhile):
    • Hard coded file and folder names (it must be in exactly one location, too bad if you have a need for two outgoing SMTP servers running on the same box with different configurations)
    That's annoying, but basically a security feature — you can be reasonably assured that a given qmail executable, especially qmail-queue (which is the only setuid-root program in qmail), is hard-coded to operate on only certain directories (aka folders) and files. And it's not "too bad" if you need to run a second SMTP server; just configure, build, and install as many distinct qmails as you need, with the configuration files (such as conf-qmail, normally /var/qmail) set as you want them. But I think it could be more flexible without sacrificing security assurances.

    • Strange homegrown replacement for the standard C library
    I gather djb's perception of the situation (at the time he wrote qmail and related software) was such that he'd substitute "Secure" for "Strange" above, but I don't personally know of exploitable bugs in contemporary C libraries, so I can't vouch for that. However, exploitable bugs in C code that uses standard C libraries are well-known, which is another reason I believe he grew his own C library.

    • Memory deallocation done by exiting the program
    Definite win for security and speed, if you don't have memory-leak problems as a result (and I don't think any qmail component does, modulo known issues with requiring per-process VM limits on Internet-facing components such as qmail-smtpd). As soon as your program starts down the path of calling free(), or, hey, even malloc(), if it can reasonably avoid them, it gets much more complicated and bug-prone, something you don't want in a system as crucial to have working correctly with no exploitable bugs as an email server.

    • Odd preprocessor "template" functions
    Haven't studied this enough to quite "get" what he was trying to accomplish vs. other approaches that could have been used, but they are doggone annoying to deal with at times.

    • A seeming hatred of descriptive variable or function names
    Agreed.

    I don't have fond memories of the experience.

    qmail code is pretty ugly when looked at closely enough, and can seem unnecessarily "different" from a more-distant perspective.

    However, pull back far enough and look at it, and you might be able to appreciate that it is, in its own way, a work of art: a reliable, secure, powerful email system — just as pretty much any sufficiently large and beautiful work of art can look pretty flawed when scrutinized closely, especially without an awareness of the "big picture".

    So if I wanted to play around with an email server and make it do all sorts of slick stuff, I wouldn't pick qmail.

    But if I wanted to improve a mail server in some fashion while still being reasonably assured the resulting (modified) system wouldn't have remotely exploitable bugs in it, based on what I know right now, I wouldn't pick anything but qmail.

    --
    Practice random senselessness and act kind of beautiful.
  15. OpenBSD by Anonymous Coward · · Score: 1, Informative

    Most (if not all) OpenBSD code I have dealt with gives me a warm tingly feeling inside. It isn't C++, but they have their reasons.

  16. A chapter in "Beautiful Code" is on this topic... by kfogel · · Score: 2, Informative

    Laura Wingerd and Christopher Seiwald wrote an excellent chapter on this topic for O'Reilly's Beautiful Code book (just out). See Chapter 32, "Code in Motion". The code from their chapter is online here: http://www.perforce.com/beautifulcode/

    --
    http://www.red-bean.com/kfogel
  17. Re:Hello World (Newer Version) by tOaOMiB · · Score: 2, Informative

    You appear to be missing the simple if and else keywords! How do you miss an if-then statement?

  18. Re:Nice three things ya got there. by Bryan+Ischo · · Score: 2, Informative

    I consider myself to be an "expert C++ developer" and I agree with the GP's comments on Boost.  Your "my truths are self-evident except to the lazy" arguments do not convince me.

    I have not seen anything that template metaprogramming can do that can't be done using other "saner" (in my opinion) techniques.  Perhaps the template metaprogramming approach can at times produce *terser* solutions, but I don't think that they are any better than more verbose non-template-metaprogramming-based solutions.

    Terse solutions are in my opinion often *less valuable* than more verbose solutions, because the latter are generally more approachable.  Of course verbosity can be taken too far, and there's definitely a balance to be struck.

    Here is an example of template metaprogramming code straight from Wikipedia:

    template <int N>
    struct Factorial
    {
        enum { value = N * Factorial<N - 1>::value };
    };

    template <>
    struct Factorial<0>
    {
        enum { value = 1 };
    };

    // Factorial<4>::value == 24
    // Factorial<0>::value == 1
    void foo()
    {
        int x = Factorial<4>::value; // == 24
        int y = Factorial<0>::value; // == 1
    }

    (this code defines a template which forces the compiler to compute factorials at compile time instead of computing them at runtime)

    I don't see anything about this that is in any way more readable or maintainable than the non-template-metaprogramming solution also posted on Wikipedia:

    int factorial(int n)
    {
        if (n == 0)
           return 1;
        return n * factorial(n - 1);
    }

    void foo()
    {
        int x = factorial(4); // == (4 * 3 * 2 * 1) == 24
        int y = factorial(0); // == 0! == 1
    }

    (interestingly, this exact question was given to me recently on an interview programming test (write C++ code to compute factorials at compile time instead of runtime); it was the only question on the test that I couldn't answer; I said basically "I know the solution has something to do with templates and partial template specialization but I don't know the syntax of partial template specialization well enough to write this" - I'm still waiting to hear if I got the job, and hoping that not knowing partial template specialization very well won't be the deciding factor, but to be honest, if this is the kind of stuff they're doing, I don't really want to work there anyway)

    I find the recursive implementation much easier to read and understand.
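
    For comparison, a minimal sketch of a third option, assuming a compiler with C++11 or later: constexpr lets the plain recursive function be evaluated by the compiler, with no template specialization involved. It assumes n is non-negative and small enough that the result fits in an int.

```cpp
// Requires C++11 or later. Assumes n >= 0 and a result that fits in int.
constexpr int factorial(int n)
{
    return n == 0 ? 1 : n * factorial(n - 1);
}

// Both checks are evaluated during compilation.
static_assert(factorial(4) == 24, "4! is computed by the compiler");
static_assert(factorial(0) == 1, "0! is computed by the compiler");
```

    The same function can still be called at runtime with a value that is not known until then, so one definition covers both uses.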

    And this is a simple example.  Really hairy template stuff like this:

    BOOST_STATIC_ASSERT((
        boost::is_same<
             twice<add_pointer_f, int>::type
           , int**
        >::value
    ));

    (taken from some Boost template metaprogramming documentation)

    ... that stuff is completely unreadable and I would never want code like that in any project I worked on.
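
    For readers who want to know what that assertion actually checks, here is a self-contained sketch. The definitions of add_pointer_f and twice below are a reconstruction using the standard <type_traits> header, not the definitions from the Boost documentation, and static_assert with std::is_same stands in for BOOST_STATIC_ASSERT and boost::is_same.

```cpp
#include <type_traits>

// A "metafunction class": a type with a nested template named apply.
// This one maps T to T*.
struct add_pointer_f {
    template <typename T>
    struct apply : std::add_pointer<T> {};
};

// Applies F to X, then applies F again to the result.
template <typename F, typename X>
struct twice {
    using once = typename F::template apply<X>::type;
    using type = typename F::template apply<once>::type;
};

// int -> int* -> int**
static_assert(
    std::is_same<twice<add_pointer_f, int>::type, int**>::value,
    "twice<add_pointer_f, int> yields int**");
```

    In other words, the snippet asserts at compile time that adding a pointer to int twice gives int**.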

    Note that I have absolutely no problem with people using solutions like this in their code, I would never try to limit someone from solving a problem in a way that was best for them.  The thing that bugs me, is that Boost and its techniques look like they are going to become "standard" C++, which means that anyone who writes C++ code in the future is going to have to deal with this stuff.  I would like template metaprogramming much more if it wasn't something looming on the horizon that the C++ standard and common usage of C++ is eventually going to force upon me.

  19. Re:good source by C.A.+Nony+Mouse · · Score: 2, Informative

    I'm not surprised. I've spent much time (long ago) reading and modifying code of university CAD tools. Magic (also by Ousterhout) was by far the most readable, and very easy to find your way in considering it was something like half a million lines of C, IIRC. In fact, I changed my own coding style as a result.

    --
    J
  20. Random PIDs by gtwilliams · · Score: 2, Informative

    Are there any operating systems out there that use random numbering of PIDs?
    Yes. AIX.
    --
    Garry Williams
  21. Re:Firefox by TheRaven64 · · Score: 4, Informative

    Are there any operating systems out there that use random numbering of PIDs?
    OpenBSD randomly numbers PIDs. Malloc and mmap on OpenBSD map pages into a random part of the process's address space too. A lot of work has been done to ensure that an attacker knows as little as possible about a program that they manage to compromise.

    For low-grade random numbers use something like /dev/urandom (on UNIX) instead. For high-grade random numbers, use /dev/random and note it may take a while to build the entropy.
    By 'UNIX' you mean 'Linux.' Other *NIX platforms do not always provide two entropy sources. On OpenBSD, /dev/random is a hardware random number generator, srandom is the strong (blocking if not enough entropy is available) random number generator, urandom is the one which transparently degrades the randomness if entropy is not available, prandom is a simple pseudo-random number generator and arandom is a device for producing seeds for an ARC4 random number generator. On FreeBSD, there is just /dev/random, which is the Yarrow generator seeded periodically from the various entropy sources.

    Don't ever hard-code /dev/* into your program unless it's one of the devices specified by POSIX. Last time I checked, this limited it to /dev/null, /dev/console and /dev/tty.
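
    One way to follow that advice is to take the device path as a parameter, so that configuration or platform-specific code decides which node to open. A minimal sketch, with read_random_bytes as a hypothetical helper; it reports failure by returning an empty vector rather than guessing at a fallback.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Reads `count` bytes from the entropy device at `device_path`.
// Returns an empty vector if the device cannot be opened or
// does not deliver the full amount.
std::vector<unsigned char> read_random_bytes(const std::string& device_path,
                                             std::size_t count)
{
    std::ifstream device(device_path, std::ios::binary);
    if (!device)
        return {};

    std::vector<unsigned char> buffer(count);
    device.read(reinterpret_cast<char*>(buffer.data()),
                static_cast<std::streamsize>(count));
    if (device.gcount() != static_cast<std::streamsize>(count))
        return {};

    return buffer;
}
```

    A caller on Linux might pass "/dev/urandom"; which path is right on other systems is exactly the platform-specific question raised above.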

    --
    I am TheRaven on Soylent News
  22. pretty driver code... by martinde · · Score: 3, Informative

    I've always been impressed by the BusLogic SCSI driver code in the Linux kernel. Anyone interested in what a good low-level, bit banging C program should look like should study its code carefully. Here is a randomly chosen snippet:

    /*
      The Modify I/O Address command does not cause a Command Complete Interrupt.
    */
    if (OperationCode == BusLogic_ModifyIOAddress)
      {
        StatusRegister.All = BusLogic_ReadStatusRegister(HostAdapter);
        if (StatusRegister.Bits.CommandInvalid)
          {
            BusLogic_CommandFailureReason = "Modify I/O Address Invalid";
            Result = -1;
            goto Done;
          }
        if (BusLogic_GlobalOptions.TraceConfiguration)
          BusLogic_Notice("BusLogic_Command(%02X) Status = %02X: "
                          "(Modify I/O Address)\n", HostAdapter,
                          OperationCode, StatusRegister.All);
        Result = 0;
        goto Done;
      }
    /*
      Select an appropriate timeout value for awaiting command completion.
    */
    switch (OperationCode)
      {
      case BusLogic_InquireInstalledDevicesID0to7:
      case BusLogic_InquireInstalledDevicesID8to15:
      case BusLogic_InquireTargetDevices:
        /* Approximately 60 seconds. */
        TimeoutCounter = 60*10000;
        break;
      default:
        /* Approximately 1 second. */
        TimeoutCounter = 10000;
        break;
      }

    This is some seriously low-level stuff, and it reads like English text. It totally changed my ideas about what this kind of code should look like! I believe it was written by the late Leonard Zubkoff.

  23. Re:The MacPaint code was donated... by Stinking+Pig · · Score: 2, Informative

    What about me? I'm not important and I hardly ever post here any more, so... oh, wait.

    And to apologize for that lame joke, here's some research:
    The Computer History Museum has transcripts and interviews, but no source: http://search.computerhistory.org/search?q=macpaint&submit.x=0&submit.y=0&site=chm_collection&client=chm_collection&proxystylesheet=chm_collection&output=xml_no_dtd
    Another interview with the original reference to putting MacPaint in there: http://www.pbs.org/cringely/nerdtv/transcripts/001.html

    I agree that code written to the Tcl/Tk style guide is clean, though messy Tcl exists. Sort of like Perl, the language everyone loves to blame for their sophomore code.

    --
    "Nothing was broken, and it's been fixed." -- Jon Carroll
  24. Golden Code by SoopahMan · · Score: 2, Informative

    What you're asking for is often called Golden Code or Golden Pages and usually exists within large software engineering companies. The problem with gaining access to such things is that they are usually considered very important to the organization that owns them, so they are not made public - they're more or less considered trade secrets, a guide to that particular company's proprietary best practices.

    The other problem with easy access to Golden Code is that it must be constantly maintained to remain... "golden." So even if someone were to post a great example online, they're probably not getting paid to do so, so it's probably going to lose its luster in a couple years. Companies who maintain Golden Code usually assign a particular product to be coded in a "golden" way and continuously maintained in that perfect state as an example to all. This requires a lot of money.

    So the point is, if you want access to Golden Code, get hired at a big software company. There are a fair number of them out there if you look outside the most obvious markets. Enjoy.

  25. OpenVPN, glib by cduffy · · Score: 2, Informative

    OpenVPN is very well-written C -- clean and accessible. Likewise for glib (not glibc, glib), presuming one likes the fun it does with macros.