Slashdot Mirror


Don't Overlook Efficient C/C++ Cmd Line Processing

An anonymous reader writes "Command-line processing is historically one of the most ignored areas in software development. Just about any relatively complicated software has dozens of available command-line options. The GNU tool gperf is a "perfect" hash function generator that, for a given set of user-provided strings, generates C/C++ code for a hash table, a hash function, and a lookup function. This article discusses how to use gperf for effective command-line processing in your C/C++ code."

13 of 219 comments

  1. Speed in options parsing? by tot · · Score: 5, Insightful

    I would not consider the speed of command line option processing to be a bottleneck in any application; the overhead of starting the program is far greater.

    1. Re:Speed in options parsing? by ScrewMaster · · Score: 3, Insightful

      I'd say the speed of human motor activity is an even greater limiting factor.

      --
      The higher the technology, the sharper that two-edged sword.
    2. Re:Speed in options parsing? by canuck57 · · Score: 3, Insightful

      I would not consider the speed of command line option processing to be a bottleneck in any application; the overhead of starting the program is far greater.

      You're just experiencing this with Java, Perl or some other high-overhead bloated program. If people pull out a heavyweight needing a 90MB VM or a 5-10MB base library that calls a cat's breakfast of shared libraries, I would agree. But let's take a look at C-based awk, for example: it is only an 80KB draw. It runs fast, is nice and general purpose, and does a good job of what it was designed to do. It can be pipelined in and out and used directly on the command line, as it has proper support for stdin, stdout and stderr. On my system, it takes only 10 disk blocks to load.

      While fewer people are proficient at it, C/C++ will outlast us all as a language. Virtually every commodity computer today uses it in its core. Many others have come and gone, yet all our OSes and scripting tools rely on it. So any doomsday predictions would be premature, and if you want fast, efficient and lean code you do C/C++....

    3. Re:Speed in options parsing? by VGPowerlord · · Score: 3, Insightful

      Not everyone uses the same tab stops.

      I see that as a good reason to use tabs. Don't like how far it's indented? Change how wide your editor displays tabs.
      --
      GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011
  2. Too much by bytesex · · Score: 3, Insightful

    I'm not sure that for the usually simple task of command line processing, I'd like to learn a whole new lex/yacc syntax thingy.

    --
    Religion is what happens when nature strikes and groupthink goes wrong.
    1. Re:Too much by hackstraw · · Score: 4, Insightful

      I'm not sure that for the usually simple task of command line processing, I'd like to learn a whole new lex/yacc syntax thingy.

      The syntax for gperf is not that bad, but it's simply the wrong tool for the job as far as commandline processing goes.

      gperf simply makes a "perfect" hash function for searching a predetermined static lookup table. It provides no mechanism for arbitrary arguments like input filenames or modifiers (like a filter for including/excluding things, or increasing/decreasing something), nor does it check for conflicting options or missing options.

      gperf would give you nothing besides a match of input to a state. gperf would provide nothing for a common commandline like: --include="*.txt" --exclude="*.backup" --with-match="some text|or this text" --limit-input=5megabytes

      getopt or just rolling your own if/else if ladder or switch statement would provide much more flexibility over gperf.
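      To make the getopt route concrete, here is a minimal sketch using getopt_long() (a GNU extension, also available on the BSDs, not part of POSIX). The struct options type, the parse_options() name and the three option names are hypothetical, chosen only to mirror the example command line above; the values are kept as raw strings rather than interpreted.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical option set, modeled on the example command line above. */
struct options {
    const char *include;     /* --include=PATTERN */
    const char *exclude;     /* --exclude=PATTERN */
    const char *limit_input; /* --limit-input=SIZE, kept as raw text */
};

/* Returns 0 on success, -1 on an unknown option or a missing argument. */
int parse_options(int argc, char **argv, struct options *opts)
{
    static const struct option long_opts[] = {
        {"include",     required_argument, NULL, 'i'},
        {"exclude",     required_argument, NULL, 'e'},
        {"limit-input", required_argument, NULL, 'l'},
        {NULL, 0, NULL, 0}
    };
    int c;

    opts->include = NULL;
    opts->exclude = NULL;
    opts->limit_input = NULL;

    while ((c = getopt_long(argc, argv, "i:e:l:", long_opts, NULL)) != -1) {
        switch (c) {
        case 'i': opts->include = optarg; break;
        case 'e': opts->exclude = optarg; break;
        case 'l': opts->limit_input = optarg; break;
        default:  return -1; /* getopt_long has already printed a diagnostic */
        }
    }
    return 0;
}
```

      Calling parse_options(argc, argv, &opts) from main() hands back the option values that a bare string-to-state match would not capture. Checks for conflicting or missing options would still have to be written by hand after the loop.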

      Now, with parsing a configuration file, gperf might help, but for processing commandline arguments, gperf is simply the wrong tool for the job.

      This is like the second or third Slashdot posting from IBM's developerWorks that is simply well-formatted nonsense. Past examples are http://developers.slashdot.org/article.pl?sid=07/04/09/1539255 and http://developers.slashdot.org/article.pl?sid=07/04/09/1539255

      This is silly on both Slashdot's and IBM's part.

  3. Yeah, because getopt(3) is a real bottleneck by V.+Mole · · Score: 4, Insightful

    Does the phrase "reinvent the wheel" strike a chord with anyone?

  4. Re:C++ I get by Anonymous Coward · · Score: 4, Insightful

    I do. On MIPS, ARM, PPC, x86, and all the other embedded stuff. I don't think C will ever die - it's the universal assembler language.

  5. And the standard says... by Anonymous Coward · · Score: 5, Insightful

    Good grief. What a strawman of an example.
    Anyone writing or maintaining command line programs knows that they
    should be using the API getopt() or getopt_long().
    There are standards on how command line options and arguments are to be
    processed. They should be followed for portability and code maintenance.

  6. Re:Joke? by iangoldby · · Score: 4, Insightful

    Someone found a "new" toy?
    Well I for one won't be using this to process command-line arguments (that's what getopt() and getopt_long() are for), but it is certainly useful to know of a tool that I can use to generate a perfect hash. The next time I need some simple but efficient code to quickly discriminate between a fixed set of strings, I'll know to Google for gperf. (Before I read this article I didn't even know it existed.)
  7. All the world is not a PC by tepples · · Score: 5, Insightful

    HOLY SHIT! 194KB BIGGER?! HOW WILL YOU EVER FIND THE SPACE FOR SUCH A HUGE EXECUTABLE?!?!

    I develop for a battery-powered computer with 384 KiB of RAM. In such an environment, what you appear to sarcastically call a "mere couple hundred kilobytes" is a bigger deal than it is on a personal computer manufactured in 2007.
  8. Wrong in so many ways by geophile · · Score: 4, Insightful

    Perfect hash functions are curiosities. If you have a static set of keys, then with enough work you can generate a perfect (i.e. collision-free) hash function. This has been known for many years. The applicability is highly limited, because you don't usually have a static set of keys, and because the cost of generating the perfect hash is usually not worth it.
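    To illustrate what "with enough work" can look like, here is a rough sketch that brute-forces a seed until a small static key set hashes without collisions. This is not gperf's algorithm. The key set, the table size, the FNV-1a style hash and the names slot_for(), build_table() and is_keyword() are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_BITS 3                  /* 2^3 = 8 slots for 5 keys */
#define TABLE_SIZE (1u << TABLE_BITS)

/* Hypothetical static key set. */
static const char *const keys[] = {"include", "exclude", "limit", "help", "version"};
#define NKEYS (sizeof keys / sizeof keys[0])

static const char *table[TABLE_SIZE]; /* slot -> key, NULL if empty */
static uint32_t seed;                 /* chosen by build_table() */

/* FNV-1a style hash with a caller-chosen seed; the top bits pick the slot. */
static uint32_t slot_for(const char *s, uint32_t sd)
{
    uint32_t h = sd;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h >> (32 - TABLE_BITS);
}

/* Try seeds until every key lands in its own slot. Returns 1 on success. */
int build_table(void)
{
    uint32_t sd;

    for (sd = 0; sd < 1000000u; sd++) {
        size_t i;

        for (i = 0; i < TABLE_SIZE; i++)
            table[i] = NULL;
        for (i = 0; i < NKEYS; i++) {
            uint32_t slot = slot_for(keys[i], sd);
            if (table[slot] != NULL)
                break;                /* collision: give up on this seed */
            table[slot] = keys[i];
        }
        if (i == NKEYS) {
            seed = sd;
            return 1;
        }
    }
    return 0;
}

/* One hash and one strcmp, with no collision chain to walk.
 * Only meaningful after build_table() has returned 1. */
int is_keyword(const char *s)
{
    const char *k = table[slot_for(s, seed)];
    return k != NULL && strcmp(k, s) == 0;
}
```

    The search cost is paid once, up front, and only works because the key set never changes, which is exactly the limitation described above.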

    Gperf might be reasonable as a perfect hash generator for those incredibly rare situations when the extra work due to a hash collision is really the one thing standing between you and acceptable performance of your application.

    I thought maybe we were seeing a bad writeup, but no, it's the authors themselves who talk about the need for high-performance command-line processing, and give the performance of processing N arguments as O(N)*[N*O(1)]. I cannot conceive of a situation in which command-line processing is a bottleneck. And their use of O() notation is wrong (they are claiming O(N**2) -- which they really don't want to do, not least because it's wrong). O() notation shows how performance grows with input size. Unless they are worrying about thousands or millions of command-line arguments, O() notation in this context is just ludicrous.

    I don't know why I'm going on at such length -- the extreme dumbness of this article just set me off.

  9. Historically? by ClosedSource · · Score: 3, Insightful

    "Command-line processing is historically one of the most ignored areas in software development."

    This is like saying that walking is historically one of the most ignored areas in human transportation.