Don't Overlook Efficient C/C++ Cmd Line Processing
An anonymous reader writes "Command-line processing is historically one of the most ignored areas in software development. Just about any relatively complicated software has dozens of available command-line options. The GNU tool gperf is a "perfect" hash function that, for a given set of user-provided strings, generates C/C++ code for a hash table, a hash function, and a lookup function. This article provides a reference for a good discussion on how to use gperf for effective command-line processing in your C/C++ code."
I would not consider speed of command line option processing to be bottleneck in any application, the overhead of starting of the program is far greater.
Does the phrase "reinvent the wheel" strike a chord with anyone?
I do. On MIPS, ARM, PPC, x86, and all the other embedded stuff. I don't think C will ever die - it's the universal assembler language.
Good grief. What a strawman of an example.
Anyone writing or maintaining command line programs knows that they
should be using the API getopt() or getopt_long().
There are standards on how command line options and arguments are to be
processed. They should be followed for portability and code maintenance.
There's this little project of which you may have heard: http://www.kernel.org/
You, whenever you compile C++ code, as it is compiled to C before machine code (unless you are using an exotic compiler such as the Compaq AXP C++ compiler for TRU64).
Excuse me???? That was not even true anymore when I started using C++, back in 1992. There are features in the C++ standard that are so extremely difficult to correctly implement in standard compliant C that it's a complete waste of effort trying to pass via C while compiling. Exception handling comes to mind as the prime example. A failed attempt to support exceptions was the reason why Cfront 4.0 was abandoned. Note that 3.0 was released as early as 1991. The last Cfront based compiler I had the horor of using was HP's CC. It was superseeded by the new native aCC by 1994 at the latest.
By the way, I used to write C/C++ compilation/optimisation stuff for a living, so I guess I know something about the topic.... :-)
Linux user since early January 1992.
You are wrong about 3):
Source: http://archive.gamespy.com/e32002/pc/carmack/
And 4) as well:
Source: http://gcc.gnu.org/onlinedocs/gcc-4.2.1/gcc/G_002b _002b-and-GCC.html
Perfect hash functions are curiosities. If you have a static set of keys, then with enough work you can generate a perfect (i.e. collision-free) hash function. This has been known for many years. The applicability is highly limited, because you don't usually have a static set of keys, and because the cost of generating the perfect hash is usually not worth it.
Gperf might be reasonable as a perfect hash generator for those incredibly rare situations when the extra work due to a hash collision is really the one thing standing between you and acceptable performance of your application.
I thought maybe we were seeing a bad writeup, but no, it's the authors' themselves who talk about the need for high-performance command-line processing, and give the performance of processing N arguments as O(N)*[N*O(1)]. I cannot conceive of a situation in which command-line processing is a bottleneck. And their use of O() notation is wrong (they are claiming O(N**2) -- which they really don't want to do, not least because it's wrong). O() notation shows how performance grows with input size. Unless they are worrying about thousands or millions of command-line arguments, O() notation in this context is just ludicrous.
I don't know why I'm going on at such length -- the extreme dumbness of this article just set me off.
It sounds like the author is statically linking his library and running on embedded an embedded system. It is not surprising in that case that the c++ standard library brings in much more code than the c standard library, but it should be made clear that it is not relevant to desktop developers, pretty much all of which dynamically link with glibc.
Again, to be clear, dynamically linking with the c++ standard library is not going to increase your executable size. Please don't try to roll your own code that exists in the standard library. It is a real nuisance when people do that.
I should qualify that by saying that template instantiations do (of course) increase executable size, but that they do so no more than if you had rolled your own.
I'm not sure that for the usually simple task of command line processing, I'd like to learn a whole new lex/yacc syntax thingy.
0 4/09/1539255 and http://developers.slashdot.org/article.pl?sid=07/0 4/09/1539255
The syntax for gperf is not that bad, but its simply the wrong tool for the job as far as commandline processing goes.
gperf simply makes a "perfect" has function for searching a predetermined static lookup. It provides no mechanism for arbitrary arguments like input filenames or modifiers (like a filter for including/excluding things, or increasing/decreasing something) nor does it check for conflicting options or missing options.
gperf would give you nothing besides a match of input to a state. gperf would provide nothing for a common commandline like: --include="*.txt" --exclude="*.backup" --with-match="some text|or this text" --limit-input=5megabytes
getopt or just rolling your own if/else if ladder or switch statement would provide much more flexibility over gperf.
Now, with parsing a configuration file, gperf might help, but for processing commandline arguments, gperf is simply the wrong tool for the job.
This is like the second or third slashdot posting from IBM's developer works that is simply a well formated nonsense. Past examples are http://developers.slashdot.org/article.pl?sid=07/
This is silly on both slashdot and IBMs part.