Slashdot Mirror


Don't Overlook Efficient C/C++ Cmd Line Processing

An anonymous reader writes "Command-line processing is historically one of the most ignored areas in software development. Just about any relatively complicated software has dozens of available command-line options. The GNU tool gperf is a "perfect" hash function generator that, for a given set of user-provided strings, generates C/C++ code for a hash table, a hash function, and a lookup function. This article is a good reference on how to use gperf for effective command-line processing in your C/C++ code."
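
To make the summary's idea of a "perfect" hash concrete, here is a minimal hand-written sketch. It is not gperf output: the four option names and the length-based hash are assumptions picked so that every key lands in its own slot, which is the property gperf searches for automatically on a given key set.

```cpp
#include <cstddef>
#include <cstring>

// Fixed key set chosen for illustration; the lengths 4, 5, 6, 7 are all distinct.
static const char* const kOptions[] = {"help", "quiet", "output", "version"};

// Collision-free hash for exactly this key set: length minus the shortest length.
inline std::size_t option_hash(std::size_t length) { return length - 4; }

// Returns the option's index in kOptions, or -1 if the word is not an option.
inline int lookup_option(const char* word) {
    const std::size_t length = std::strlen(word);
    if (length < 4 || length > 7) return -1;  // outside the hash range
    const std::size_t slot = option_hash(length);
    // One string comparison confirms the candidate; no probing is needed.
    return std::strcmp(kOptions[slot], word) == 0 ? static_cast<int>(slot) : -1;
}
```

Because no two keys share a slot, a lookup costs one hash computation and at most one string comparison. A word that merely has the right length, such as "helo", is rejected by the final comparison.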

18 of 219 comments

  1. Re:Speed in options parsing? by ChronosWS · · Score: 4, Informative

    Indeed, what the hell? Now you have to have another tool and another source file for what is essentially declaring a dictionary in C++, which should be in any good developer's library? Yeesh.

    If you don't like the nasty nested ifs, make the keys in your dictionary the command line options and the values delegates, then just loop through your list of options passed on the command-line, invoking the delegate as appropriate. Eliminates the if, there are no switch statements either, and each of your command line arguments is now handled by a function dedicated to it, bringing all of the benefits of compartmentalizing your code rather than stringing it out in a huge processing function.
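
The dictionary-of-delegates approach described above might look roughly like this in C++. This is a sketch under assumptions: the `Settings` record, the option names, and the function names are all invented for illustration, with `std::unordered_map` standing in for the dictionary and `std::function` for the delegates.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical settings record filled in by the option handlers.
struct Settings {
    bool verbose = false;
    bool force = false;
    std::vector<std::string> unknown;  // arguments with no registered handler
};

// One small handler per option; the map plays the role of the dictionary.
using Handler = std::function<void(Settings&)>;

inline const std::unordered_map<std::string, Handler>& option_table() {
    static const std::unordered_map<std::string, Handler> table = {
        {"--verbose", [](Settings& s) { s.verbose = true; }},
        {"--force",   [](Settings& s) { s.force = true; }},
    };
    return table;
}

// Loop over the arguments and invoke the matching handler, if any.
inline Settings parse_options(const std::vector<std::string>& args) {
    Settings settings;
    for (const std::string& arg : args) {
        auto it = option_table().find(arg);
        if (it != option_table().end()) {
            it->second(settings);
        } else {
            settings.unknown.push_back(arg);
        }
    }
    return settings;
}
```

In a real program the vector would be built from `argv + 1` up to `argv + argc`. Adding an option then means adding one table entry rather than another branch in a long if/else chain.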

  2. Re:Speed in options parsing? by pete-classic · · Score: 3, Informative

    What a limited point of view. See "man system", for example.

    -Peter

  3. Broken handling of vtables in linkers by tepples · · Score: 4, Informative

    "Now you have to have another tool and another source file for what is essentially declaring a dictionary in C++, which should be in any good developer's library?"

    Due to the brokenness of how some linkers handle virtual method lookup tables, using anything from the C++ standard library tends to bring in a large chunk of dead code from the standard library. I compiled hello-iostream.cpp using MinGW and the executable was over 200 KiB after running strip, compared to the 6 KiB executable produced from hello-cstdio.cpp. Sometimes NIH syndrome produces runtime efficiency, and on a handheld system, efficiency can mean the difference between fitting your app into widely deployed hardware and having to build custom, much more expensive hardware.
  4. Equivalent Python by Anonymous Coward · · Score: 0, Informative

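    import sys

    # Hypothetical handlers; each receives the remaining arguments.
    def function_1(args):
        print('a called with', args)

    def function_2(args):
        print('b called with', args)

    # The variable "functions" is set to a Python dictionary.
    # Built-in dictionaries already use fast hash-table lookups.
    # Bound methods such as self.method_1 can be values too.
    functions = {'a': function_1,
                 'b': function_2}

    if __name__ == '__main__':
        value, args = sys.argv[1], sys.argv[2:]
        func = functions[value]  # raises KeyError for an unknown option
        func(args)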

  5. It is if the linker complains about not finding it by tepples · · Score: 4, Informative

    "Yeah, because getopt(3) is a real bottleneck"

    getopt() is declared in the header <unistd.h>, which is part of POSIX, not ANSI C. POSIX facilities are not guaranteed to be present on W*nd?ws systems. getopt() also handles only short options, not long options. For those, you have to use getopt_long() from <getopt.h>, which isn't even in POSIX.

    "Does the phrase 'reinvent the wheel' strike a chord with anyone?"

    If the wheel isn't licensed appropriately, copyright law requires you to reinvent it. Specifically, using software under the GNU Lesser General Public License in a proprietary program intended to run on a platform whose executables are ordinarily statically linked, such as a handheld or otherwise embedded system, is cumbersome.
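
For readers who have not used the getopt() interface discussed in this comment, a minimal sketch of short-option parsing follows. It assumes a POSIX system that provides <unistd.h>; the `-v` and `-o FILE` options and the `ShortOptions` record are invented for illustration.

```cpp
#include <string>
#include <unistd.h>  // POSIX getopt(); not part of ISO C or C++

// Hypothetical result record for this sketch.
struct ShortOptions {
    bool verbose = false;
    std::string output;  // argument of -o, empty if absent
    bool ok = true;      // false if getopt() reported an error
};

// Parses "-v" (a flag) and "-o FILE" (an option that takes an argument).
inline ShortOptions parse_short_options(int argc, char* const argv[]) {
    ShortOptions result;
    optind = 1;  // restart scanning so the function can be called again
    opterr = 0;  // suppress getopt()'s own error messages
    int c;
    while ((c = getopt(argc, argv, "vo:")) != -1) {
        switch (c) {
            case 'v': result.verbose = true; break;
            case 'o': result.output = optarg; break;
            default:  result.ok = false; break;  // '?' for an unknown option
        }
    }
    return result;
}
```

The option string "vo:" declares `-v` as a plain flag and, through the trailing colon, `-o` as an option that requires an argument. Long options such as `--output` are outside what this call handles, which is the limitation the comment points out.
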
  6. Re:C++ I get by Enselic · · Score: 4, Informative

    You are wrong about 3):

    The process of building the new engine went much more smoothly than anything we have done before, because I was able to do all the groundwork while the rest of the company worked on TeamArena. By the time they were ready to work on it, things were basically functional. I did most of the early development work with a gutted version of Quake 3, which let me write a brand new renderer without having to rewrite file access code, console code, and all the other subsystems that make up a game. After the renderer was functional and the other programmers came off of TA and Wolf, the rest of the codebase got rewritten. Especially after our move to C++, there is very little code remaining from the Q3 codebase at this point.

    Source: http://archive.gamespy.com/e32002/pc/carmack/


    And 4) as well:

    Historically, compilers for many languages, including C++ and Fortran, have been implemented as "preprocessors" which emit another high level language such as C. None of the compilers included in GCC are implemented this way; they all generate machine code directly. This sort of preprocessor should not be confused with the C preprocessor, which is an integral feature of the C, C++, Objective-C and Objective-C++ languages.

    Source: http://gcc.gnu.org/onlinedocs/gcc-4.2.1/gcc/G_002b_002b-and-GCC.html

  7. devkitARM by tepples · · Score: 3, Informative

    "And you do so using MinGW and c++?"

    Yes, I do so with devkitARM (a cross-compiling GCC toolchain that is itself compiled with MinGW) and C++.
  8. only relevant to static linking by sentientbrendan · · Score: 4, Informative

    It sounds like the author is statically linking his library and running on an embedded system. It is not surprising in that case that the C++ standard library brings in much more code than the C standard library, but it should be made clear that this is not relevant to desktop developers, pretty much all of whom dynamically link with glibc.

    Again, to be clear, dynamically linking with the c++ standard library is not going to increase your executable size. Please don't try to roll your own code that exists in the standard library. It is a real nuisance when people do that.

    I should qualify that by saying that template instantiations do (of course) increase executable size, but that they do so no more than if you had rolled your own.

  9. Character encoding conversion by tepples · · Score: 2, Informative

    "How many of these embedded tools you write actually _do_ command line processing?"

    None yet, but they do handle other things that involve dictionaries, such as character encoding conversion. A program designed to move items back and forth between a town in Animal Crossing (for Nintendo GameCube) and a town in Animal Crossing: Wild World (for Nintendo DS) needs to be able to understand the encodings of character names and town names that these games use, possibly by converting between their proprietary 8-bit codecs and UTF-8.

    "why don't you invest in more (both memory- and time-) efficient ways to do IPC than the command line?"

    Because the command line, pipes, and sockets are the most obvious ways for two programs to communicate if their copyright licenses prohibit them from being linked together into one executable.
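
The dictionary-driven encoding conversion mentioned in this comment could be sketched as below. The byte-to-character mapping is entirely invented for illustration, since the games' actual code pages are not given here; only the table-lookup shape is the point.

```cpp
#include <string>
#include <unordered_map>

// Invented mapping for illustration only; it is NOT either game's real codec.
inline const std::unordered_map<unsigned char, std::string>& high_byte_table() {
    static const std::unordered_map<unsigned char, std::string> table = {
        {0x80, "\xC3\xA9"},      // U+00E9, e with acute accent
        {0x81, "\xC3\xB1"},      // U+00F1, n with tilde
        {0x82, "\xE2\x99\xAA"},  // U+266A, eighth note
    };
    return table;
}

// Converts a string in the hypothetical 8-bit codec to UTF-8.
// Bytes below 0x80 are assumed to be plain ASCII; unmapped bytes become '?'.
inline std::string decode_to_utf8(const std::string& raw) {
    std::string out;
    for (unsigned char byte : raw) {
        if (byte < 0x80) {
            out.push_back(static_cast<char>(byte));
            continue;
        }
        auto it = high_byte_table().find(byte);
        out += (it != high_byte_table().end()) ? it->second : std::string("?");
    }
    return out;
}
```

With only 128 possible high bytes, a plain 128-entry array would serve equally well and avoid hashing altogether; the map is used here only to mirror the "dictionary" wording.
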
  10. Re:C++ I get by mce · · Score: 3, Informative

    Of course C++ exceptions are what I meant. What else would I mean when using the word "exceptions" in this context?

    And yes, C++ exceptions can be expressed in C. After all, C is a glorified assembler and the resulting code from C++ translation is assembler as well. It all depends on the level of abstraction at which the C code is written and on the amount of ugliness/inefficiency you're willing to take on board (and also the trade-off between both of the latter). But that's not the point. The point of this thread is that nowadays it makes no sense to make use of this capability in a C++ compiler. Especially not when considering that a user of a C++ compiler wants more than just a compiler. He also wants a debugger that is able to meaningfully link up the binary and the original C++ source. If you're a C++ compiler vendor, using C as an IL does nothing but complicate your own life. Twice.
  11. Re:And the standard says... by JNighthawk · · Score: 2, Informative

    Yes, because we should be using functions that are NOT IN THE STANDARD to maintain portability.

    Oh, and as far as I know, those functions aren't in VC++, which is what a hefty chunk of C/C++ development is done on.

    --
    Wheel in the sky keeps on turnin'.
  12. Boost.Program_Options? by nahpets77 · · Score: 2, Informative

    What about Boost.Program_Options? I thought I'd see a post on it here somewhere, but not one person has mentioned it (yet).

    A few months ago, I was looking around for a C++ library for parsing command line options. I checked out getopt and I thought that there must be something that uses std::string instead of char*. After some googling, I found that Boost.Program_Options seemed to be exactly what I was looking for. It supports long and short options (-s,--short) and I was able to start using it quite easily after looking at the tutorials.

  13. Re:C++ I get by mce · · Score: 3, Informative

    The main problem (but not the only one) is called "object destructors". You have to make sure they are called. All of them, and in the correct order, at all the nested scopes of execution you are in when the exception occurs. And you need to make sure not to call them on any object not yet constructed (always remember that constructors can throw exceptions too) and never to call a destructor twice (I've seen this kind of bug multiple times in multiple compilers). And then there is the fun of exceptions thrown by destructors, not to mention the possibility that it all happens in the middle of constructing or destructing an array of objects.

    All that is why setjmp()/longjmp(), also known as C's non-local goto, don't cut it, which in turn means that you need to complicate function return mechanisms. And just when you think you got that problem sorted out, you need to be aware that C++ functions can call (library) C functions that were never compiled to even know about exceptions but that in turn can call C++ functions that may again throw an exception. The entire construction needs to be able to handle this.

    As I wrote in another post in this thread, it can be done. But it is not easy. Note that the entire object destructor issue also applies within a single scope, which is why life is not as easy as replacing every "throw" statement by "goto end;".
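
The destructor bookkeeping described in this comment can be observed from the C++ side with a small sketch. The `Tracer` class and `unwind_demo` function are invented for illustration; they show that unwinding destroys fully constructed objects in reverse order across nested scopes and skips the object whose constructor threw.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Records construction and destruction events in a shared log.
struct Tracer {
    std::vector<std::string>& log;
    std::string name;
    Tracer(std::vector<std::string>& l, const std::string& n, bool fail = false)
        : log(l), name(n) {
        if (fail) throw std::runtime_error("constructor of " + name + " failed");
        log.push_back("ctor " + name);
    }
    ~Tracer() { log.push_back("dtor " + name); }
};

// Builds objects in nested scopes; the third constructor throws.
inline std::vector<std::string> unwind_demo() {
    std::vector<std::string> log;
    try {
        Tracer a(log, "a");
        {
            Tracer b(log, "b");
            Tracer c(log, "c", /*fail=*/true);  // never fully constructed
            log.push_back("unreachable");
        }
    } catch (const std::runtime_error&) {
        log.push_back("caught");
    }
    return log;
}
```

The returned log reads "ctor a", "ctor b", "dtor b", "dtor a", "caught": no destructor runs for `c`, and `b` is destroyed before `a`. A translation to C has to reproduce exactly this ordering by hand, which is the difficulty the comment describes.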

  14. Re:All the world is not a PC by siride · · Score: 2, Informative

    A kilobyte means 1024 bytes among programmers. Any programmer that doesn't know that would likely not know what a kibibyte is either.

  15. Re:It is if the linker complains about not finding by tqbf · · Score: 2, Informative

    Absolutely. There is no platform for which gperf is a better, more portable option for command line processing than getopt. I'm not sure what you think getopt does that is "tricky" under Win32. It's a string processor.

  16. Re:It is if the linker complains about not finding by __aawavt7683 · · Score: 2, Informative

    When faced with the issue of implementing getopt on Windows, I merely took the code from FreeBSD: src/lib/libc/stdlib/getopt.c

    I love FreeBSD. (I once changed the motherboard, rebooted, went, "Oh.. shit," and proceeded to log in. All drivers are compiled as modules, in less time than my lean Linux kernel. :-/)

    I sidestepped the license issue, stripped out extraneous header files, changed a couple of references to _getprogname() (either to static string "" or to a global var, as it is in libc), read the man page to figure out how to use it and had a short-form option parser in.. probably under an hour.

    Some things you have to code. For everything else, the Regents of the University of California have done it for you.

  17. Re:It is if the linker complains about not finding by tqbf · · Score: 2, Informative

    Again, on the off chance that this helps anyone reading this pitifully long and silly thread: it is trivial to make getopt work on Win32, just like it was trivial to make strsep work on Linux when it only had strtok. I object to the argument that "portability" has anything whatsoever to do with whether you'd use getopt to parse arguments.

    Like most of the other comments on this post, I find the idea of using gperf for "high performance argument parsing" superfluous and convoluted. In fact, I find the idea of a general-purpose perfect hash tool a bit superfluous as well; gperf languishes in obscurity for a reason.

  18. Re:Speed in options parsing? by JesseMcDonald · · Score: 3, Informative

    Writing code that writes code--now we're thinking!

    But what could we call this code, a compiler? Nah, I think we need to think of another word for it.

    How about "macro"?

    --
    "The state is that great fiction by which everyone tries to live at the expense of everyone else." - Bastiat