Don't Overlook Efficient C/C++ Cmd Line Processing

Speed in options parsing? by tot · 2007-07-29 03:40 · Score: 5, Insightful

I would not consider speed of command line option processing to be a bottleneck in any application; the overhead of starting the program is far greater.

Re:Speed in options parsing? by ScrewMaster · 2007-07-29 03:42 · Score: 3, Insightful

I'd say the speed of human motor activity is an even greater limiting factor.

--
The higher the technology, the sharper that two-edged sword.
Re:Speed in options parsing? by Anonymous Coward · 2007-07-29 03:44 · Score: 2, Insightful

It's still handy to have a fairly comfortable way of generating code that does things needed every time (or at least very, very often) in an easily applicable and very optimized way. I like it.
Re:Speed in options parsing? by ChronosWS · 2007-07-29 03:46 · Score: 4, Informative

Indeed, what the hell? Now you have to have another tool and another source file for what is essentially declaring a dictionary in C++, which should be in any good developer's library? Yeesh.

If you don't like the nasty nested ifs, make the keys in your dictionary the command line options and the values delegates, then just loop through your list of options passed on the command-line, invoking the delegate as appropriate. Eliminates the if, there are no switch statements either, and each of your command line arguments is now handled by a function dedicated to it, bringing all of the benefits of compartmentalizing your code rather than stringing it out in a huge processing function.
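
For illustration, the dictionary-of-handlers approach described above might look roughly like the following C++ sketch. It assumes C++11. The `Settings` struct, the two option names, and the helpers `make_option_table` and `dispatch_options` are all made up for this example, and a real program would also need to handle options that take values.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical settings struct, used only for this illustration.
struct Settings {
    bool verbose = false;
    bool show_help = false;
};

// Maps each option string to the handler ("delegate") dedicated to it.
using OptionTable = std::map<std::string, std::function<void(Settings&)>>;

inline OptionTable make_option_table() {
    return {
        {"--verbose", [](Settings& s) { s.verbose = true; }},
        {"--help",    [](Settings& s) { s.show_help = true; }},
    };
}

// Loops over the arguments and invokes the matching handler, if any.
// Returns the arguments that had no entry in the table.
inline std::vector<std::string> dispatch_options(
        const std::vector<std::string>& args,
        const OptionTable& table,
        Settings& settings) {
    std::vector<std::string> unknown;
    for (const std::string& arg : args) {
        auto it = table.find(arg);
        if (it != table.end()) {
            it->second(settings);    // one handler per option, no if/switch chain
        } else {
            unknown.push_back(arg);  // e.g. file names or misspelled options
        }
    }
    return unknown;
}
```

Adding an option then means adding one table entry and one handler, without touching the loop.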
Re:Speed in options parsing? by pete-classic · 2007-07-29 03:53 · Score: 3, Informative

What a limited point of view. See "man system", for example.

-Peter
Re:Speed in options parsing? by hxnwix · 2007-07-29 04:02 · Score: 1

Except on Windows XP, where pipe performance degraded an order of magnitude as compared to Windows 2000.
Re:Speed in options parsing? by Anonymous Coward · 2007-07-29 04:11 · Score: 5, Funny

You're not a real programmer if you won't over optimize unrelevant parts of your code.
Re:Speed in options parsing? by canuck57 · 2007-07-29 04:11 · Score: 3, Insightful

I would not consider speed of command line option processing to be a bottleneck in any application; the overhead of starting the program is far greater.
You're just experiencing this with Java, Perl or some other high-overhead, bloated program. If people pull out a heavyweight needing a 90MB VM, or a 5-10MB base library calling the cat's breakfast of shared libraries, I would agree. But let's take a look at C-based awk, for example: it is only an 80kb draw. It runs fast, is nice and general purpose, and does a good job of what it was designed to do. It can be pipelined in and out and used directly on the command line, as it has proper support for stdin, stdout and stderr. On my system, it takes only 10 disk blocks to load.
While fewer people are proficient at it, C/C++ will outlast us all for a language. Virtually every commodity computer today uses it in its core. Many others have come and gone, yet all our OSes and scripting tools rely on it. So any doomsday predictions would be premature, and if you want fast, efficient and lean code you do C/C++....
Re:Speed in options parsing? by Anonymous Coward · 2007-07-29 04:12 · Score: 2, Insightful

Indeed. The applications of perfect hashing (and minimal perfect hashing) are quite limited. Basically it only makes sense if you need to quickly identify strings from a fixed, finite set of strings known at compile time. And as with all optimizations, only if that part of your program is a bottleneck or you are prepared to optimize all other aspects of your program as well.

The traditional example application for perfect hashing was identifying keyword tokens when building a compiler, but for complex modern languages like C++ parsing source code is just a very tiny fraction of the compilation process. And even that scenario makes more sense than parsing command line options.

I doubt there is a single application that significantly benefits from hashed lookup of command line options. Suggesting that it makes sense to spend your time increasing the complexity of your application for a practically immeasurable improvement in performance is insanity.
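
To make the "fixed, finite set of strings known at compile time" case concrete, here is a tiny hand-rolled sketch. It is not gperf output. It assumes a hypothetical set of four options whose lengths happen to be distinct, so string length alone works as a minimal perfect hash; a generator such as gperf exists to find a suitable function for sets where nothing this trivial works. The names `kOptions` and `lookup_option` are invented for the example.

```cpp
#include <cstddef>
#include <cstring>

// Fixed option set, ordered so that index == strlen(name) - 6.
// The lengths (6, 7, 8, 9) are all distinct, so length alone is a
// minimal perfect hash for exactly these four strings.
static const char* const kOptions[] = {
    "--help", "--input", "--output", "--verbose",
};
static const std::size_t kOptionCount = sizeof(kOptions) / sizeof(kOptions[0]);
static const std::size_t kMinLength = 6;

// Returns the slot of `word` in kOptions, or -1 if it is not in the set.
inline int lookup_option(const char* word) {
    const std::size_t len = std::strlen(word);
    if (len < kMinLength || len - kMinLength >= kOptionCount) {
        return -1;  // hash value falls outside the table
    }
    const std::size_t slot = len - kMinLength;
    // A perfect hash only guarantees no collisions among the known keys,
    // so one final comparison is still needed to reject other strings.
    return std::strcmp(word, kOptions[slot]) == 0 ? static_cast<int>(slot) : -1;
}
```

A lookup costs one length computation and at most one string comparison. Whether that saving matters for a handful of command line options is exactly what this thread disputes.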
Re:Speed in options parsing? by Anonymous Coward · 2007-07-29 04:26 · Score: 0, Insightful

C/C++ will outlast us all for a language.
There's no such language as C/C++.
Re:Speed in options parsing? by eokyere · 2007-07-29 04:27 · Score: 1

as somebody already mentioned, speed in options parsing is pretty useless, and I could use commons-cli (in Java) and the groovy CliBuilder for cmdline options that arguably look cleaner and more accessible to a lot more ppl

    def cli = new CliBuilder(usage: "foo [args] baz")
    cli.i(argName:"path", longOpt:"input", args:1, required:false, "src")
    cli.o(argName:"path", longOpt:"output", args:1, required:false, "dest")
    cli.h(longOpt:"help", "this message")
    def options = cli.parse(args)
    if (!options || !options.i || options.h) {
        println "foobaz ver 0.0.1"
        cli.usage()
        return
    }
    // rest of code
Re:Speed in options parsing? by ai3 · 2007-07-29 05:11 · Score: 4, Funny

You must not have seen the recent proposal for GNU tools options, which will require four dashes instead of two and a minimum of four words per option. Under a UN/EU funded program to ease the transition to intelligent machines, developers are rewarded for implementing full-sentence options and/or prose. But initial experiments showed that many users were unwilling to wait for the parsing of the command "remove-files --recursively-from-root-directory --do-not-ask-for-confirmation-just-delete --i-really-want-this!" just to be 1337, which led to whatever development efforts are mentioned in the article, which I didn't read.
Re:Speed in options parsing? by Anonymous Coward · 2007-07-29 05:23 · Score: 1, Funny

Yes, this will help maintainability but a consistent and standardized naming convention is helpful too.
function ifYouThoughtYouWereGoingToEditThisFileUsingATerminalBasedEditorThenYouNeedToThinkAgainOhYeahIIndentUsingTabsToo() { /* */ }
Re:Speed in options parsing? by Maniac-X · 2007-07-29 06:43 · Score: 4, Funny

Klingon function calls do not have 'parameters' - they have 'arguments.' AND THEY ALWAYS WIN THEM!

--
(A)bort, (R)etry, (I)gnore?_
Re:Speed in options parsing? by ultranova · 2007-07-29 07:36 · Score: 2, Insightful

While fewer people are proficient at it, C/C++ will outlast us all for a language. Virtually every commodity computer today uses it in its core.

Which is why they are so crash-prone. With C/C++, any mistake whatsoever will likely crash the program/machine, and possibly also allow crackers to make the program execute arbitrary code.

Many others have come and gone, yet all our OSes and scripting tools rely on it. So any doomsday predictions would be premature, and if you want fast, efficient and lean code you do C/C++....

If you want fast, efficient and lean code, write it. Simply picking C/C++ doesn't make your code so, nor does not picking them make the code slow. What C/C++ does is make programs hard to port due to the ambiguous definitions of some critical parts (such as type lengths), prone to crashing due to manual memory management, and dependant on external systems such as Gtk, Gnome or KDE for their graphical user interface due to C predating widespread adoption of computer graphics.

As long as our computers keep on depending on C, C++ or any other language with such horrendous features, expect to see a new buffer overflow or other such exploit every week. I, for one, welcome our new garbage-collected bounds-checked Java overlords which don't crash randomly as C programs do.

--
Forget magic. Any technology distinguishable from divine power is insufficiently advanced.
Re:Speed in options parsing? by Anonymous Coward · 2007-07-29 07:55 · Score: 1, Insightful

Type lengths in C/C++ can be specified with uint32_t, int8_t, etc. If they are not available for a certain platform, they are just a typedef away.
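
As a rough illustration of the point about fixed-width types, the sketch below uses `<cstdint>` and compile-time checks. The `RecordHeader` struct and the `wrap_add` helper are hypothetical, and the layout assertion assumes a typical ABI with no unusual padding; on a platform where it does not hold, the build fails instead of the program misbehaving at run time.

```cpp
#include <climits>
#include <cstdint>

// The exact-width typedefs are only defined where the platform can
// provide them, so these checks document the assumption explicitly.
static_assert(sizeof(std::uint32_t) * CHAR_BIT == 32, "uint32_t must be 32 bits");
static_assert(sizeof(std::int8_t) * CHAR_BIT == 8, "int8_t must be 8 bits");

// Hypothetical record header whose field sizes should not vary by platform.
struct RecordHeader {
    std::uint32_t length;
    std::uint16_t kind;
    std::uint16_t flags;
};
// Assumption: no padding is inserted between or after these members.
static_assert(sizeof(RecordHeader) == 8, "unexpected padding in RecordHeader");

// Unsigned 32-bit addition wraps modulo 2^32 regardless of the width of int.
inline std::uint32_t wrap_add(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>(a + b);
}
```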
'dependant on external systems such as Gtk, Gnome or KDE', you don't know crap about programming, do you? There are libraries, not 'systems', and Gnome and KDE are not libraries (although there are gnome and KDE libraries). Most 'Gnome' programs are actually gtk programs (KDE programs are usually /true/ KDE programs, not just Qt). I'm not sure what you are suggesting, but I'm sure it's stupid. Even MS Windows no longer includes all the graphical system in the kernel.
If you think Java solves all the programming problems, you're nuts. It doesn't solve most of them, it just hides them; and it creates a whole host of new ones. And btw, Java doesn't include a 'graphical system', it just has a couple of libraries that can be used for that (and awt sucks majorly, swing is not so crappy but hardly a panacea).
Re:Speed in options parsing? by Anonymous Coward · 2007-07-29 09:00 · Score: 0

Hashing it is just so cool.
http://worsethanfailure.com/Articles/Classics-Week-How-Not-to-Parse-Command-Line-Arguments.aspx
Re:Speed in options parsing? by Millenniumman · 2007-07-29 09:30 · Score: 2

What is the problem with tabs? Are there any text editors/compilers that anyone uses that don't support tabs? I find them to be better than multiple spaces, even if the text editor has tab mapped to multiple spaces.

--
Stupidity is like nuclear power, it can be used for good or evil. And you don't want to get any on you.
Re:Speed in options parsing? by etnu · 2007-07-29 10:06 · Score: 1

"Fast" and "Efficient" usually means "C", not "C/C++". C++ (especially the STL) introduces way too much unnecessary bloat to be productive (not to mention the annoying debugging process). Applications which "need" C++ are usually better handled with Java or C# (depending on your platform). C will be here for a very, very long time, but C++ will probably die sooner rather than later.
Re:Speed in options parsing? by asuffield · 2007-07-29 10:14 · Score: 1

I would not consider speed of command line option processing to be a bottleneck in any application; the overhead of starting the program is far greater.

You're just experiencing this with Java, Perl or some other high-overhead, bloated program.

No, even with the most naive of command-line argument parsing code, it is highly unlikely that it will take a significant amount of time compared to the effort required for the kernel to fork off a new process, exec the binary, and for the dynamic loader to set it up for execution. This process typically takes several milliseconds - or more than a hundred if it has to be loaded from disk first.

Command line argument parsing is just not that hard, compared to the amount of work involved in spawning a new program.
Re:Speed in options parsing? by timeOday · 2007-07-29 10:50 · Score: 2, Insightful

I'd say the speed of human motor activity is an even greater limiting factor.

I wouldn't bet on that. The command line is not just a human/computer interface, but also a computer/computer interface. It's very common for one script to fire off many others.

That said, I agree with the grandparent that it's hard to imagine a program where command line processing is a significant runtime expense.
Re:Speed in options parsing? by sbryant · 2007-07-29 11:03 · Score: 2, Interesting
What is the problem with tabs?

The problem is that people set their tab breaks at all sorts of places (eg: every 4 characters), and then use tabs to space things in the middle of lines, or they'll mix tabs and spaces at the beginnings of lines. When somebody with different settings opens the same file, the indentation looks really screwed. That happens even after you've gotten everybody to agree on a common number of columns for indentation.

I only know of two solutions:
1. Make all software, everywhere, ever, use tab stops every eight characters and never anything else.
2. Use spaces.
I didn't have the energy to do the first, so I use the second solution.

If you're developing on your own it's not an issue, but I don't like to have one coding style here and another there - it's not just confusing, but it takes a while to change my editor settings every time I open code for somebody else. I use spaces and that's that. At least my editors are clever enough to know that Makefiles still need tabs!

-- Steve
Re:Speed in options parsing? by HeroreV · 2007-07-29 11:10 · Score: 2, Insightful
"Mixing tabs and spaces for indenting is bad. It causes many problems that you don't encounter when using only spaces. Therefore, tabs are bad, so only use spaces."

That is the only significant argument against tabs I've ever read, and I've probably read it a hundred times. Only a moron wouldn't realize that it's the mixing that is bad, not the tabs or the spaces, but apparently there are a lot of morons out there.

tabs: good
spaces: good
mixing tabs and spaces: bad

I personally prefer tabs. Why?
- Different code uses a different number of spaces for indenting, which makes copying between them more time consuming.
- Easier to change indention width. It's a simple change of an IDE preference instead of a risky text replace.
- Tabs are more semantic. That might sound stupid, but it gives me a warm fuzzy feeling.
Re:Speed in options parsing? by amightywind · 2007-07-29 12:14 · Score: 1

Use gcc much?

--
an ill wind that blows no good
Re:Speed in options parsing? by smittyoneeach · 2007-07-29 13:27 · Score: 2, Funny

Writing code--drudge work.
Writing code that writes code--now we're thinking!

--
Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
Re:Speed in options parsing? by Anonymous Coward · 2007-07-29 13:32 · Score: 0

Not everyone uses the same tab stops. Tabs are evil.
Re:Speed in options parsing? by VGPowerlord · 2007-07-29 14:13 · Score: 1

I would not consider speed of command line option processing to be a bottleneck in any application; the overhead of starting the program is far greater.

Have you tested this using getopt() and getopt_long() , or did you mean by parsing them manually?

--
GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011
Re:Speed in options parsing? by VGPowerlord · 2007-07-29 14:17 · Score: 3, Insightful

Not everyone uses the same tab stops.

I see that as a good reason to use tabs. Don't like how far it's indented? Change how wide your editor displays tabs.

--
GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011
Re:Speed in options parsing? by Millenniumman · 2007-07-29 15:03 · Score: 1

Huh, that's one of the reasons I like tabs. People can choose different tab stop sizes to fit their preference, without changing source files. I guess I can see that not working well if tabbing is done strangely, or mixed with spaces. I mostly leave that to my editor so it's done fairly well.

--
Stupidity is like nuclear power, it can be used for good or evil. And you don't want to get any on you.
Re:Speed in options parsing? by fractoid · 2007-07-29 15:33 · Score: 1

Over-optimise unrelevant parts of code? Unpossible!

--
Rampant carbon sequestration destroyed the Dinosaurs' tropical paradise. I'm here to help repair the damage.
Re:Speed in options parsing? by TehZorroness · 2007-07-29 16:51 · Score: 1

While fewer people are proficient at it, C/C++ will outlast us all for a language.

While that is true, I think that as time progresses, the utility of C and C++ will fade as more programming languages develop and are better integrated with their host systems and each other. The only reason they are preferred today is that they hold a niche in UNIX operating systems that no other language really fits into. I have a sense that the long dead LISP Machines of yesterday may make their return someday, only with more ass-kicking power. (Imagine C++, Ruby, and LISP coexisting peacefully without glue.)

The sandbox many VMs and interpreted languages offer makes programming more about mathematics and design, and less about paying attention to every last technical detail and working around oversights made in 20 year old specifications. Unfortunately, each interpreter or VM is off in its own world and you need to do horrid glue work to get the slightest level of linkage. A large step in the future history of computer operating system development will be when the barriers of each language offering its own ABI vanish and each program doesn't need to be a static black box any more. We have the cars, we just need roads we can drive them on.
Re:Speed in options parsing? by muridae · 2007-07-29 21:29 · Score: 2, Funny

Writing code that writes code--now we're thinking!
But what could we call this code, a compiler? Nah, I think we need to think of another word for it.
Re:Speed in options parsing? by Ahruman · 2007-07-29 23:24 · Score: 1

I, for one, only make cromulent optimizations.
Re:Speed in options parsing? by ultranova · 2007-07-30 01:47 · Score: 1

Type lengths in C/C++ can be specified with uint32_t, int8_t, etc. If they are not available for a certain platform, they are just a typedef away.

In other words, if you take great care you can write C/C++ code which might work under another platform with only minimal modifications.

'dependant on external systems such as Gtk, Gnome or KDE', you don't know crap about programming, do you? There are libraries, not 'systems', and Gnome and KDE are not libraries (although there are gnome and KDE libraries).

That was quite incoherent.

I'm not sure what you are suggesting, but I'm sure it's stupid.

I think this sums your post nicely.

Even MS Windows no longer includes all the graphical system in the kernel.

What does this have to do with C/C++ ?

If you think Java solves all the programming problems, you're nuts. It doesn't solve most of them, it just hides them; and it creates a whole host of new ones.

Of course Java doesn't solve all programming problems, but it does solve the buffer overruns, dangling pointers and arbitrary code execution problems which plague C/C++.

And btw, Java doesn't include a 'graphical system', it just has a couple of libraries that can be used for that (and awt sucks majorly, swing is not so crappy but hardly a panacea).

Pray tell, what do you think a "graphical system" means ?

--
Forget magic. Any technology distinguishable from divine power is insufficiently advanced.
Re:Speed in options parsing? by JesseMcDonald · 2007-07-30 05:04 · Score: 3, Informative

Writing code that writes code--now we're thinking!

But what could we call this code, a compiler? Nah, I think we need to think of another word for it.

How about "macro"?

--
"The state is that great fiction by which everyone tries to live at the expense of everyone else." - Bastiat
Re:Speed in options parsing? by JesseMcDonald · 2007-07-30 05:23 · Score: 1

I guess I can see that not working well if tabbing is done strangely, or mixed with spaces. I mostly leave that to my editor so it's done fairly well.

Ever try changing your tabstops? That's where leaving formatting up to your editor can get you in trouble. Tabs will make a mess of things unless you either mandate that everyone use the same size of tabstops, or don't care about vertical layout at all. For example, if you take the following code with five-character tabstops:

    [tab]float asdfg = 40.4,
    [tab][tab] bnmvc = 3.7,
    [tab][tab][tab]c = 15.8;

and display it in an editor using eight-character tabstops, you get this:

    [tab   ]float asdfg = 40.4,
    [tab   ][tab   ] bnmvc = 3.7,
    [tab   ][tab   ][tab   ]c = 15.8;

which is nothing like the nice column layout you intended. It gets worse if tabs are placed in the middle of a line rather than just the beginning, or if the code is edited with a mix of different tabstop settings.
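
The misalignment in the example above can be reproduced with a small tab-expansion helper. This is only a sketch: `expand_tabs` and `column_of` are names invented here, and the code assumes `tab_width` is greater than zero.

```cpp
#include <cstddef>
#include <string>

// Expands each tab to the next multiple of `tab_width` columns.
// Precondition (assumed, not checked): tab_width > 0.
inline std::string expand_tabs(const std::string& line, std::size_t tab_width) {
    std::string out;
    for (char ch : line) {
        if (ch == '\t') {
            const std::size_t pad = tab_width - (out.size() % tab_width);
            out.append(pad, ' ');
        } else {
            out.push_back(ch);
        }
    }
    return out;
}

// Zero-based column at which `needle` starts once tabs are expanded,
// or std::string::npos if it does not occur.
inline std::size_t column_of(const std::string& line,
                             const std::string& needle,
                             std::size_t tab_width) {
    return expand_tabs(line, tab_width).find(needle);
}
```

With five-column tabs, `asdfg` on the first line and `bnmvc` on the second both start at column 11. With eight-column tabs they start at columns 14 and 17, so the intended vertical alignment is lost.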

--
"The state is that great fiction by which everyone tries to live at the expense of everyone else." - Bastiat
Re:Speed in options parsing? by Anonymous Coward · 2007-07-30 05:45 · Score: 0

I would actually like java and other modern VM based systems with two minor provisions.

They quit using glorified goto statements for exception handling and there be some way to guarantee operations can't fail unless the processor and related northbridge melts.

Trying to write server code in an environment where anything can happen, where even basic math or any black box can throw any manner of exception, is unhelpful and annoying.

At least with C programs, if there is a programming mistake the software usually dies. Java programs tend to generate stack traces and go on their merry way pretending nothing just happened. I've seen many commercial applications rewritten in Java doing just that on a regular basis. I don't know about you but that scares the hell out of me. C/C++ forces better emphasis on QA and code quality because crash errors almost always equal game over. I certainly have never experienced software rewritten in Java having better reliability than the original. The promise of Java in my experience breaks down when you start adding necessary reliability constraints to important server code.

For basic desktop applications and UI work on the other hand I reckon the experience is quite a bit different.
Re:Speed in options parsing? by JacobO · 2007-07-30 06:39 · Score: 1

Command line argument parsing is just not that hard, compared to the amount of work involved in spawning a new program.

Command-line parsing may be lightweight. I would say it has a lot to do with the program's requirements. A reasonable generalization is to say that it is typically not compute-intensive so should not be considered a risk for performance problems.
Re:Speed in options parsing? by Anonymous Coward · 2007-07-30 06:42 · Score: 0

>Not everyone uses the same tab stops.

>>I see that as a good reason to use tabs. Don't like how far it's
>>indented? Change how wide your editor displays tabs.

That's fine if you're writing code for no one but yourself, but if you
email your code, post it on the web, or submit it to an open source
project, you can't be sure your carefully indented code will be
received or viewed in exactly the way you expect. The only way to be
certain is never to use embedded tab characters.

It amazes me how people cling to tabs, as if they're somehow cool or
"right". There was a time when that made some sense, when disk space
was a critical issue. Those days are long, long over, however.
Indentation in coding is really a matter of clarity. Think of it as
syntax, the same as in any form of writing. You wouldn't take a
chance on how your readers perceive semicolons, question marks, or
periods. Why would you run a risk with indentation?

As for using tabs to change your indentation in one fell swoop, do you
really change your mind that often? I've written C/C++ code
practically every day for 30 years. The last time I changed my
indentation interval was about 15 years ago, when I switched from 8
spaces to 4. Just settle on an interval that suits you, stick with
it, and NEVER use tab characters. You'll never have to think about it
again and you can rest easy that your readers will see what you intend
them to see.
Re:Speed in options parsing? by Anonymous Coward · 2007-07-31 16:52 · Score: 0

It amazes me how people cling to tabs, as if they're somehow cool or "right"

I know this may be considered a different discussion, but when using a language such as Python (which is internally based on C), indentation determines the block to which the current code line belongs. Therefore, tabs are not always used because they are cool or "right" but because they would be required to have the code actually work and work right.
Re:Speed in options parsing? by Bill+Dog · 2007-07-31 17:41 · Score: 1

Of course Java doesn't solve all programming problems, but it does solve the buffer overruns, dangling pointers and arbitrary code execution problems which plague C/C++.

There is no "C/C++" programming language. C++ solves those with STL containers, references, and smart pointers. C++ also solves the Java problems of forgetting to release resources other than memory, such as a database connection, with its deterministic destruction and RAII idiom. So, in order of safety, from worst to best, it's C, Java, and then C++.

To be fair, however, I must say, an awful lot of C++ programmers program it in the C way. That is, while there's really no "C/C++" programming language, tons of people nevertheless are basically programming in it! And do indeed experience all the problems you mention.
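
A minimal sketch of the C++ facilities mentioned above: deterministic destruction (RAII), a smart pointer, and a bounds-checked container access. The `Connection` class is a hypothetical stand-in for something like a database connection; it only counts open handles so that the release can be observed.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical resource: increments a counter on open, decrements on close.
class Connection {
public:
    explicit Connection(int& open_count) : open_count_(open_count) { ++open_count_; }
    ~Connection() { --open_count_; }  // runs on every exit path from the scope
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;
private:
    int& open_count_;
};

// Throws midway; the connection is still released during stack unwinding.
inline void query_that_fails(int& open_count) {
    Connection conn(open_count);
    throw std::runtime_error("query failed");
}

// A smart pointer releases the heap-allocated object when it goes out of scope.
inline int open_count_after_scope(int& open_count) {
    {
        std::unique_ptr<Connection> conn(new Connection(open_count));
        // open_count is one higher inside this scope
    }
    return open_count;
}

// vector::at checks the index and throws instead of overrunning the buffer.
inline bool out_of_range_is_caught(const std::vector<int>& v, std::size_t index) {
    try {
        (void)v.at(index);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

None of this is enforced by the language: `operator[]` and raw `new`/`delete` remain available, which is the "programming C++ in the C way" caveat raised above.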

--
Attention zealots and haters: 00100 00100
Re:Speed in options parsing? by Nicolay77 · 2007-08-01 10:57 · Score: 1

The problem with tabs is that a bunch of Lisp hackers wrote a document ages ago that openly criticized tabs.

That said, tabs are a really bad idea in Lisp source code, because lisp needs a fine grained indentation scheme.

However, tabs are great in C/C++/Java etc. I love and use tabs extensively (except in Lisp).

Anyone that thinks tabs are bad has been indoctrinated by this anti-tabs document without really understanding the language for which it was written.

--
We are Turing O-Machines. The Oracle is out there.
Re:Speed in options parsing? by Crazy+Eight · 2007-08-06 00:52 · Score: 1

I can understand why C would seem vulnerable to criticism in respect to ambiguous type sizes, but hindsight is 20/20, types larger than a byte relate to the machine architecture by design, and stdint.h (or homegrown alternatives) provide discrete specification. "C" is called a "portable assembler" because it cuts right to the bone and deals with what von Neumann architecture machines do. Other languages aren't more portable. They've just done the porting for you.
In any event your argument fizzled into Candyland when you brought GTK+ et al into it. Speech synthesizers would do nothing to the relevance of any language you might trumpet as more modern.

Too much by bytesex · 2007-07-29 03:41 · Score: 3, Insightful

I'm not sure that for the usually simple task of command line processing, I'd like to learn a whole new lex/yacc syntax thingy.

--
Religion is what happens when nature strikes and groupthink goes wrong.

Re:Too much by hackstraw · 2007-07-29 11:26 · Score: 4, Insightful

I'm not sure that for the usually simple task of command line processing, I'd like to learn a whole new lex/yacc syntax thingy.

The syntax for gperf is not that bad, but it's simply the wrong tool for the job as far as commandline processing goes.

gperf simply makes a "perfect" hash function for searching a predetermined static lookup. It provides no mechanism for arbitrary arguments like input filenames or modifiers (like a filter for including/excluding things, or increasing/decreasing something) nor does it check for conflicting options or missing options.

gperf would give you nothing besides a match of input to a state. gperf would provide nothing for a common commandline like: --include="*.txt" --exclude="*.backup" --with-match="some text|or this text" --limit-input=5megabytes

getopt or just rolling your own if/else if ladder or switch statement would provide much more flexibility over gperf.
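
For comparison, options that carry arbitrary values, as in the command line above, could be handled with `getopt_long` roughly as follows. This is a sketch, assuming a platform that provides `<getopt.h>` (`getopt_long` is a GNU extension that is also commonly available elsewhere). The `FilterOptions` struct and `parse_filter_options` function are hypothetical, and only two of the options are shown.

```cpp
#include <getopt.h>
#include <string>

// Hypothetical result type for this illustration.
struct FilterOptions {
    std::string include;
    std::string exclude;
    bool ok = true;
};

inline FilterOptions parse_filter_options(int argc, char* argv[]) {
    static const struct option long_options[] = {
        {"include", required_argument, nullptr, 'i'},
        {"exclude", required_argument, nullptr, 'x'},
        {nullptr, 0, nullptr, 0},  // terminator required by getopt_long
    };
    FilterOptions result;
    optind = 1;  // restart scanning so the function can be called more than once
    int c;
    // "i:x:" also accepts the short forms -i VALUE and -x VALUE.
    while ((c = getopt_long(argc, argv, "i:x:", long_options, nullptr)) != -1) {
        switch (c) {
            case 'i': result.include = optarg; break;
            case 'x': result.exclude = optarg; break;
            default:  result.ok = false; break;  // unknown option or missing value
        }
    }
    return result;
}
```

Both `--include=*.txt` and `--include *.txt` are accepted, and a missing value is reported by the library rather than by hand-written checks.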

Now, with parsing a configuration file, gperf might help, but for processing commandline arguments, gperf is simply the wrong tool for the job.

This is like the second or third slashdot posting from IBM's developerWorks that is simply well-formatted nonsense. Past examples are http://developers.slashdot.org/article.pl?sid=07/04/09/1539255 and http://developers.slashdot.org/article.pl?sid=07/04/09/1539255

This is silly on both slashdot's and IBM's part.
Re:Too much by Anonymous Coward · 2007-07-30 00:32 · Score: 0

gperf has another problem if you're not writing GPLed code. What is the license of the code it generates? Bison has an extra license clause making it clear that its generated code can be used in non-GPL projects, but gperf just uses the standard GPL.

Joke? by Anonymous Coward · 2007-07-29 03:42 · Score: 0

This has to be a joke? Sheesh. Someone found a "new" toy?

Re:Joke? by iangoldby · 2007-07-29 04:20 · Score: 4, Insightful

Someone found a "new" toy?
Well I for one won't be using this to process command-line arguments (that's what getopt() and getopt_long() are for), but it is certainly useful to know of a tool that I can use to generate a perfect hash. The next time I need some simple but efficient code to quickly discriminate between a fixed set of strings, I'll know to Google for gperf. (Before I read this article I didn't even know it existed.)
Re:Joke? by Anonymous Coward · 2007-07-29 04:49 · Score: 0

probably H1B PhD.
Re:Joke? by Anonymous Coward · 2007-07-29 05:03 · Score: 0

> probably H1B PhD.

Yeah, Bill Gates just can't get enough of these guys. Put 5000 in a room for a year and they'll bang out total crap that you'll be forced to sell to reclaim your 'investment'. If you're really unlucky, they may even leave you with a real stinker like Windows Vista.
Re:Joke? by bumby · 2007-07-29 05:54 · Score: 1

this was actually what I first thought it was to be used for, before I read the comments. I thought the preview was about how complicated commandline tools were to use with all their options, and gperf was an example of such a program.

--
Hey! That's my sig you're smoking there!

Maybe overkill BUT... by Derek+Loev · 2007-07-29 03:47 · Score: 1

it does create some good-looking code.

Yeah, because getopt(3) is a real bottleneck by V.+Mole · 2007-07-29 03:50 · Score: 4, Insightful

Does the phrase "reinvent the wheel" strike a chord with anyone?

Re:Yeah, because getopt(3) is a real bottleneck by seebs · 2007-07-30 06:08 · Score: 1

To be fair, standard getopt is sorta mediocre. Like everyone, I wrote my own which handled long options, option arguments, and more. Mine's somewhere in the same general class as popt and GNU getopt_long; I suspect that, had I been a decade older when I'd written it, I wouldn't have written it.

I can't imagine using gperf for the task, though. In fact, I've spent real time fixing programs that could take "-a -b" but not "-ab".
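
The "-a -b" versus "-ab" case is one that POSIX `getopt` already covers. A minimal sketch, assuming `<unistd.h>` is available; the `Flags` struct and `parse_flags` function are made up for the example.

```cpp
#include <unistd.h>

// Hypothetical pair of boolean flags.
struct Flags {
    bool a = false;
    bool b = false;
    bool ok = true;
};

inline Flags parse_flags(int argc, char* argv[]) {
    Flags flags;
    optind = 1;  // restart scanning so the function can be called more than once
    int c;
    // getopt walks clustered short options ("-ab") one letter at a time,
    // so the loop body is the same as for separate "-a -b".
    while ((c = getopt(argc, argv, "ab")) != -1) {
        switch (c) {
            case 'a': flags.a = true; break;
            case 'b': flags.b = true; break;
            default:  flags.ok = false; break;  // unrecognized option
        }
    }
    return flags;
}
```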

--
My blog: http://www.seebs.net/log/ --- My iPhone/iPad app: http://www.seebs.net/seebsfrac/

Re:C++ I get by Anonymous Coward · 2007-07-29 04:01 · Score: 4, Insightful

I do. On MIPS, ARM, PPC, x86, and all the other embedded stuff. I don't think C will ever die - it's the universal assembler language.

And the standard says... by Anonymous Coward · 2007-07-29 04:02 · Score: 5, Insightful

Good grief. What a strawman of an example.
Anyone writing or maintaining command line programs knows that they
should be using the API getopt() or getopt_long().
There are standards on how command line options and arguments are to be
processed. They should be followed for portability and code maintenance.

Re:And the standard says... by iangoldby · 2007-07-29 04:14 · Score: 0

Anyone writing or maintaining command line programs knows that they
should be using the API getopt() or getopt_long()...
Someone please mod parent up (not this).
Re:And the standard says... by The+Vulture · 2007-07-29 04:36 · Score: 1

From what I can see in the article, it's not meant to replace getopt/getopt_long.

I am currently writing an application (for my employer) where this may be useful. Although it also uses command line parameters (via getopt_long), it also receives commands in ASCII over a network connection - that is what I believe this article targets.

Because the commands I receive can have almost any series of parameters in any sequence however, I prefer to do what another poster here already stated - you look for keywords in a lookup table, and then call a function to handle whatever keywords come up afterwards. The suggestion of the article is that rather than iterating on a lookup table, you can use a hashing function to more quickly determine which keyword you are looking at.

The extra complexity of this method however (having to use extra tools) makes me lean towards simple iteration - easier to code, and when you add a new token, it's a minimal change.
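
The simple-iteration approach described here might be sketched as a static table of keywords and handler functions that is scanned linearly. Everything named below (`Session`, the `NAME` and `VOLUME` commands, `run_command`) is hypothetical and only meant to show the shape of the code, not any actual protocol.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical per-connection state.
struct Session {
    std::string name;
    int volume = 0;
};

// Each handler receives whatever follows its keyword on the line.
typedef bool (*CommandHandler)(Session& session, const char* params);

inline bool handle_name(Session& s, const char* params) {
    s.name = params;
    return true;
}

inline bool handle_volume(Session& s, const char* params) {
    s.volume = std::atoi(params);
    return true;
}

struct CommandEntry {
    const char* keyword;
    CommandHandler handler;
};

// Adding a new command is a one-line change to this table plus its handler.
static const CommandEntry kCommands[] = {
    {"NAME", handle_name},
    {"VOLUME", handle_volume},
};

// Splits "KEYWORD params" at the first space and scans the table linearly.
// Returns false for an unknown keyword.
inline bool run_command(Session& session, const std::string& line) {
    const std::size_t space = line.find(' ');
    const std::string keyword = line.substr(0, space);
    const std::string params =
        (space == std::string::npos) ? std::string() : line.substr(space + 1);
    for (const CommandEntry& entry : kCommands) {
        if (keyword == entry.keyword) {
            return entry.handler(session, params.c_str());
        }
    }
    return false;
}
```

For a table of this size the linear scan is a few string comparisons per command, which is the trade-off being weighed against a generated hash function.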

-- Joe
Re:And the standard says... by Frankie70 · 2007-07-29 05:05 · Score: 1

Anyone writing or maintaining command line programs knows that they
should be using the API getopt() or getopt_long().

There is no getopt or getopt_long in the C or C++ standard.
Re:And the standard says... by JNighthawk · 2007-07-29 08:14 · Score: 2, Informative

Yes, because we should be using functions that are NOT IN THE STANDARD to maintain portability.

Oh, and as far as I know, those functions aren't in VC++, which is what a hefty chunk of C/C++ development is done on.

--
Wheel in the sky keeps on turnin'.
Re:And the standard says... by jlarocco · 2007-07-29 10:36 · Score: 1

There is no getopt or getopt_long in the C or C++ standard.

getopt is in Posix.

getopt_long is a GNU extension, though

--
Maybe not
Re:And the standard says... by Anonymous Coward · 2007-07-30 12:09 · Score: 0

getopt_long is a GNU extension, though

So using getopt_long locks you to LGPL or GPL?
Re:And the standard says... by jlarocco · 2007-07-30 12:59 · Score: 1

So using getopt_long locks you to LGPL or GPL?

Sigh. No.

--
Maybe not
Re:And the standard says... by Anonymous Coward · 2007-07-31 01:50 · Score: 0

If getopt_long is not LGPL or GPL and is a GNU extension, how is it licensed?

Functional Programming rules the world by Anonymous Coward · 2007-07-29 04:03 · Score: 0

OCaml for the win

Re:C++ I get by V.+Mole · 2007-07-29 04:04 · Score: 5, Funny

There's this little project of which you may have heard: http://www.kernel.org/

Broken handling of vtables in linkers by tepples · 2007-07-29 04:06 · Score: 4, Informative

Now you have to have another tool and another source file for what is essentially declaring a dictionary in C++, which should be in any good developer's library?

Due to the brokenness of how some linkers handle virtual method lookup tables, using anything from the C++ standard library tends to bring in a large chunk of dead code from the standard library. I compiled hello-iostream.cpp using MinGW and the executable was over 200 KiB after running strip, compared to the 6 KiB executable produced from hello-cstdio.cpp. Sometimes NIH syndrome produces runtime efficiency, and on a handheld system, efficiency can mean the difference between fitting your app into widely deployed hardware and having to build custom, much more expensive hardware.

Re:Broken handling of vtables in linkers by adah · 2007-07-29 16:57 · Score: 1

I compiled hello-iostream.cpp using MinGW and the executable was over 200 KiB after running strip, compared to the 6 KiB executable produced from hello-cstdio.cpp.

This is a problem of code optimization. Developers on the PC seem to pay less and less attention to code size now. It did not use to be the case. GCC 2.95.3 w/ SGI STL will produce a HelloWorld.exe of a few dozen kilobytes, if my memory serves me right.

This said, I do not feel it is a serious issue. When developing for an embedded system, where storage/memory restrictions are tight, a serious developer has to get some decent libraries optimized for code size. In fact, even with the standard C++ library implementation of MinGW GCC 3.4.5, using just std::map does not increase the code as much as std::cout. In my case, it is 30+ kilobytes (compared to 200+ in the case of iostream).
Re:Broken handling of vtables in linkers by Alioth · 2007-07-29 23:30 · Score: 1

I'm currently developing some embedded code. I have 2K of program flash to play with, and a whopping 128 bytes (not K, not meg, 128 bytes, i.e. 0.1K) to play with. This is not at all uncommon on embedded devices.

--
Oolite: Elite-like game. For Mac, Linux and Windows

Re:C++ I get by hxnwix · 2007-07-29 04:08 · Score: 1, Interesting

Oh, gee, well, nobody except:

1) Every linux kernel developer
2) Every *BSD kernel developer
3) John Carmack, for the core of every ID engine up to and possibly beyond Doom3
4) You, whenever you compile C++ code, as it is compiled to C before machine code (unless you are using an exotic compiler such as the Compaq AXP C++ compiler for TRU64).

Equivalent Python by Anonymous Coward · 2007-07-29 04:10 · Score: 0, Informative

import sys

def function_1 (...):
    ...

functions = {'a': function_1, 'b': function_2, 'c': self.method_1, ...}
func = functions[value]

if __name__ == '__main__':
    args = sys.argv[1:]
    func(args)

# The variable "functions" is set to a Python dictionary.
# Built-in dictionaries already use fast hash-table lookups.

Re:Equivalent Python by Anonymous Coward · 2007-07-29 04:16 · Score: 0

(1) Python dict() does not use a perfect hash function.

(2) In your example, the dictionary is built online rather than being compiled into the program.

(3) Your chosen language has no support for buffer overflows and is far too easy to understand and maintain.
Re:Equivalent Python by Anonymous Coward · 2007-07-29 05:03 · Score: 0

Pretty. However, if your purpose was to somehow show how Python is "superior" when it comes
to parsing command-line options, uhhhh... get real. Who cares? After all, my special language
simply handles command-line options with no code at all. It just figures it out from the options
you ask for in the program. MUCH smaller than your Python code. And this means..?

If, on the other hand, you were more interested in demonstrating how a Python program with
nice command line handling might talk to C or C++ for some function, I applaud you and
also recommend that you also explore:

* Boost::program_options
* Boost::Python
* SWIG
* Shed Skin (Python -> C++ compiler) ...have fun!
Re:Equivalent Python by Anonymous Coward · 2007-07-29 05:37 · Score: 0

But I like pretty. After using C++, Java, PHP, and ugghhh...VB for a while, I finally got around to learning Python...and I'm hooked. I haven't gotten to the point of handling command line options in the 2 wxPython apps I'm working on. When I saw the C++ code, I looked at 3 Python examples and synthesized the Python version. And posted it, cause it was so pretty.
If, on the other hand, you were more interested in demonstrating how a Python program with
nice command line handling might talk to C or C++ for some function, I applaud you and
also recommend that you also explore:

* Boost::program_options
* Boost::Python
* SWIG
* Shed Skin (Python -> C++ compiler) ...have fun!

Nah, to do that I'd have to import the ctypes library. It would have added a few LOC. I'm interested in integrating C/C++ code if necessary for performance, but I haven't focused on that yet, preferring pure Python for its simplicity.

But, similar to Shed Skin, PyPy is pretty nifty. It's currently Python written on top of Python, but you can "translate" the high-level code to C, .Net, JVM, even Javascript (!). It's very similar to the pseudo-code and code generators in books like the Pragmatic Programmer. Crazy.

Re:C++ I get by iangoldby · 2007-07-29 04:11 · Score: 2, Insightful

I use C for any low-level programming project that doesn't warrant an object-oriented approach.

The trick is to identify the best tool for the job.

I'm doing it. by www.sorehands.com · 2007-07-29 04:15 · Score: 1

I'm currently rewriting Post Road Mailer, which is in C on OS/2. I also wrote a e-mail scanner. It all depends on what you need to do.

I did a phone interview for a job a couple of years ago. Remote underwater sensor equipment. It had to run on battery, so you'd think they would have written it in C or C++? It would once in a while turn on the hard drive once the flash drive was full.

The more you abstract something, the less efficient it becomes.

There are millions of lines of COBOL code still running.

"The Jenolan could probably fly rings around the Enterprise on impulse." Geordi LaForge.

--
Fight Spammers!

Re:I'm doing it. by DreadSpoon · 2007-07-29 04:50 · Score: 1

The more you abstract something, the less efficient it becomes.

This is not at all true, especially not today. I'd trust an abstract container library to optimize its internals far more than I'd trust you or almost any other individual developer to do the same.

I trust my C compiler to get the very many high-level optimizations required by today's CPUs right far more than I'd trust you or almost any other individual developer to do the same.

Yeah, sometimes those high level libraries or languages get things wrong, but that's not a given just because they're more abstract. It's merely an implementation bug.

If you had some C code that was inefficient compared to assembler code, then you just don't know how to write efficient C code or you were using a shit compiler.
Re:I'm doing it. by WuphonsReach · 2007-07-29 16:26 · Score: 1

I'm currently rewriting Post Road Mailer, which is in C on OS/2.

I used that mailer! I think I still even have a bunch of e-mails still in that format.

--
Wolde you bothe eate your cake, and have your cake?
Re:I'm doing it. by nonos · 2007-07-29 21:34 · Score: 1

In asm, you can modify the code on the fly, this is one of the reasons you can make asm code faster than c code, whatever the quality of the compiler.

My hand optimized code is always 20x faster than C, and I don't think my compiler nor c coding skills suck.
Re:I'm doing it. by David+Greene · 2007-07-30 03:33 · Score: 1

In asm, you can modify the code on the fly

You can do the same with C and C++.

--
Re:I'm doing it. by nonos · 2007-07-31 03:09 · Score: 1

Thanks for the info ! I'll give it a try.

Joke? by Anonymous Coward · 2007-07-29 04:16 · Score: 0

What kind of joke is this? The example in listing 1 is using strtok() to do something it can't do, and even if it did what the authors intended, they wrote comments documenting something else.

Correction... by Pedrito · 2007-07-29 04:19 · Score: 1, Insightful

Just about any relatively complicated software has dozens of available command-line options.

That should probably be rephrased to "Just about any relatively complicated software that inflicts command-lines on its users..."

This is clearly a very unix oriented post as there are relatively few command-line windows apps and few window GUI apps that accept command-lines. But this is also a topic that's about as old as programming itself and clearly something that takes the "new" out of "news".

Re:Correction... by AuMatar · 2007-07-29 04:36 · Score: 1

Umm, most Windows apps accept command line inputs; it's just not the default way of using them. But type it in at the command line and you'd be surprised. A few that come to mind: VC++'s compiler and Internet Explorer.

--
I still have more fans than freaks. WTF is wrong with you people?
Re:Correction... by Ambiguous+Puzuma · 2007-07-29 04:40 · Score: 1

You might be surprised. Command line options may not be featured prominently in Windows applications, but that doesn't mean they're not there. If you have Microsoft Visual Studio, for example, try "devenv /?" sometime. (For non-Windows users: Devenv.exe is the executable to start Visual Studio's IDE.)
Re:Correction... by Anonymous Coward · 2007-07-29 04:42 · Score: 1, Interesting

Hardly the case. Most of the win32 shit I've used accepts command lines. It's much simpler and a more powerful debugging tool than to force a config file change for every attempt.
Re:Correction... by Maniac-X · 2007-07-29 07:05 · Score: 1

That's not true. Most Windows programs accept command-line arguments (just take a look at ANY game, as a matter of fact); they're simply not used often because most Windows users a) don't know they exist, b) wouldn't know how to do it without some detailed instruction, and c) would probably not see the point in trying it anyway.

--
(A)bort, (R)etry, (I)gnore?_
Re:Correction... by Anonymous Coward · 2007-07-29 13:54 · Score: 0

something that takes the "new" out of "news".
Thus leaving us with "s". Add some wildcards, and it becomes "s***". I do believe you are correct.
Re:Correction... by Anonymous Coward · 2007-07-29 17:04 · Score: 0

You, Sir are a "classical windiot".

I'll explain: Most of the (even proficient) Windows users don't know about their platform.

Of course most Windows programs have command line options -- and that is *good*, because that's the way for the sysadmin and the power user to customize the behaviour of an application icon.

Maybe that's what I hate most about Windows. It ain't Microsoft, it is this culture of ignorance.

Grmbl.
Re:Correction... by DragonWriter · 2007-07-30 11:35 · Score: 1

This is clearly a very unix oriented post as there are relatively few command-line windows apps and few window GUI apps that accept command-lines.

I dunno what you consider "few" but lots of Windows GUI apps I've encountered accept command-line arguments. Most users don't recognize it because they just click the pretty icon, of course, but that doesn't mean they aren't there.

Re:C++ I get by Anonymous Coward · 2007-07-29 04:22 · Score: 1, Interesting

http://www.tiobe.com/tpci.htm/

Re:C++ I get by mce · 2007-07-29 04:25 · Score: 4, Interesting

You, whenever you compile C++ code, as it is compiled to C before machine code (unless you are using an exotic compiler such as the Compaq AXP C++ compiler for TRU64).

Excuse me???? That was not even true anymore when I started using C++, back in 1992. There are features in the C++ standard that are so extremely difficult to correctly implement in standard compliant C that it's a complete waste of effort trying to pass via C while compiling. Exception handling comes to mind as the prime example. A failed attempt to support exceptions was the reason why Cfront 4.0 was abandoned. Note that 3.0 was released as early as 1991. The last Cfront based compiler I had the horror of using was HP's CC. It was superseded by the new native aCC by 1994 at the latest.

By the way, I used to write C/C++ compilation/optimisation stuff for a living, so I guess I know something about the topic.... :-)

--
Linux user since early January 1992.

It is if the linker complains about not finding it by tepples · 2007-07-29 04:29 · Score: 4, Informative

Yeah, because getopt(3) is a real bottleneck

getopt() is in the header <unistd.h>, which is in POSIX, not ANSI. POSIX facilities are not guaranteed to be present on W*nd?ws systems. It also handles only short options, not long options. For those, you have to use getopt_long() of <getopt.h>, which isn't even in POSIX.

Does the phrase "reinvent the wheel" strike a chord with anyone?

If the wheel isn't licensed appropriately, copyright law requires you to reinvent it. Specifically, using software under the GNU Lesser General Public License in a proprietary program intended to run on a platform whose executables are ordinarily statically linked, such as a handheld or otherwise embedded system, is cumbersome.

I agree... by SuperKendall · 2007-07-29 04:31 · Score: 2, Insightful

There's a time and place for gperf - command line argument processing is not it!

Actually, I've never really come across a case where I knew ahead of time the whole universe of strings I would be accepting, and so never ended up using it - gperf is a great idea, but this seems to be a case of someone really looking hard to figure out where they could shoehorn gperf into just for the sake of using it.

--
"There is more worth loving than we have strength to love." - Brian Jay Stanley

Re:I agree... by thogard · 2007-07-29 06:04 · Score: 1

This whole discussion reminds me of the often quoted phrase "Premature optimisation is the root of all evil" but you bring up an interesting point that I disagree with.
There is a place for gperf in command line processing, it's just not for production programs. It is fine for experimental programs as a training exercise.

Re:C++ I get by Enselic · 2007-07-29 04:33 · Score: 4, Informative

You are wrong about 3):

The process of building the new engine went much more smoothly than anything we have done before, because I was able to do all the groundwork while the rest of the company worked on TeamArena. By the time they were ready to work on it, things were basically functional. I did most of the early development work with a gutted version of Quake 3, which let me write a brand new renderer without having to rewrite file access code, console code, and all the other subsystems that make up a game. After the renderer was functional and the other programmers came off of TA and Wolf, the rest of the codebase got rewritten. Especially after our move to C++, there is very little code remaining from the Q3 codebase at this point.

Source: http://archive.gamespy.com/e32002/pc/carmack/

And 4) as well:

Historically, compilers for many languages, including C++ and Fortran, have been implemented as "preprocessors" which emit another high level language such as C. None of the compilers included in GCC are implemented this way; they all generate machine code directly. This sort of preprocessor should not be confused with the C preprocessor, which is an integral feature of the C, C++, Objective-C and Objective-C++ languages.

Source: http://gcc.gnu.org/onlinedocs/gcc-4.2.1/gcc/G_002b_002b-and-GCC.html

All the world is not a PC by tepples · 2007-07-29 04:34 · Score: 5, Insightful

HOLY SHIT! 194KB BIGGER?! HOW WILL YOU EVER FIND THE SPACE FOR SUCH A HUGE EXECUTABLE?!?!

I develop for a battery-powered computer with 384 KiB of RAM. In such an environment, what you appear to sarcastically call a "mere couple hundred kilobytes" is a bigger deal than it is on a personal computer manufactured in 2007.

Re:All the world is not a PC by sholden · 2007-07-29 05:05 · Score: 1

And you do so using MinGW and c++?
Re:All the world is not a PC by Urusai · 2007-07-29 05:42 · Score: 0, Flamebait

I'd say a bigger deal is your pretentious use of kibibytes (KiB).
Re:All the world is not a PC by Anonymous Coward · 2007-07-29 06:00 · Score: 1, Interesting

"I develop for a battery-powered computer with 384 KiB of RAM. In such an environment, what you appear to sarcastically call a "mere couple hundred kilobytes" is a bigger deal than it is on a personal computer manufactured in 2007."

I fail to see how is this strong argument in this discussion. How many of these embedded tools you write actually _do_ command line processing? If they do, why don't you invest in more (both memory- and time-) efficient ways to do IPC than the command line?
Re:All the world is not a PC by GrievousMistake · 2007-07-29 09:03 · Score: 1

If he's working on a system where size matters, he'll want to use precise terms to describe it. That's hardly pretentious, any more than NASA being meticulous to specify whether they're using imperial or metric units.

--
In a fair world, refrigerators would make electricity.
Re:All the world is not a PC by jschimpf · 2007-07-29 09:28 · Score: 1

We are on the second iteration of an embedded system (first pass on Linux, second pass on a proprietary OS); in each we start a number of applications. These apps are all the same executable, but command line options tell the app where to find info on which .so's (DLL's) it loads to give it a personality. The options also tell a particular app its name and information on where its file system lives. In this case command line parsing is quite important. We use a simple system which locks argc,argv into globals in each app, and then we have a function that can find a particular flag and associated value at any point in the code. This way we don't have to parse the command line in main() (although some happens there); wherever we need to get a particular flag we can access it. We come down on the side of minimizing code size/RAM size over efficiency in our choices, as none of these flag searches is done more than once and they don't have any visible effect on program speed.
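
A minimal sketch of that kind of scheme, under assumptions: the names g_argc, g_argv, save_args and find_flag are hypothetical, and the "-flag value" layout is only one plausible convention, not necessarily the one this poster uses.

```cpp
#include <cassert>
#include <cstring>

// Saved once at startup (for example, first thing in main); names are illustrative.
static int    g_argc = 0;
static char** g_argv = nullptr;

void save_args(int argc, char** argv) {
    g_argc = argc;
    g_argv = argv;
}

// Returns the word following `flag` (e.g. "-name foo" gives "foo"),
// an empty string if the flag is the last word on the command line,
// or nullptr if the flag is absent. One linear pass per call.
const char* find_flag(const char* flag) {
    assert(flag != nullptr);
    for (int i = 1; i < g_argc; ++i) {
        if (std::strcmp(g_argv[i], flag) == 0) {
            return (i + 1 < g_argc) ? g_argv[i + 1] : "";
        }
    }
    return nullptr;
}
```

The whole thing is one loop and two globals, which fits the code-size-over-speed choice described above: each lookup rescans argv, but each flag is only looked up once.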
Re:All the world is not a PC by siride · 2007-07-29 10:34 · Score: 2, Informative

A kilobyte means 1024 bytes among programmers. Any programmer that doesn't know that would likely not know what a kibibyte is either.
Re:All the world is not a PC by Anonymous Coward · 2007-07-29 11:16 · Score: 0

> A kilobyte means 1024 bytes among programmers.

A kilobyte means 1000 bytes when referring to hard drive capacity. Yes, I know it's stupid, but that's the facts of life.
Re:All the world is not a PC by mikael · 2007-07-29 13:30 · Score: 1, Funny

I thought that read kibobytes, or maybe even kibblebytes

--
Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
Re:All the world is not a PC by Anonymous Coward · 2007-07-29 14:07 · Score: 0

OK, now that I've taken the bad, where's the good?
Re:All the world is not a PC by peterpi · 2007-07-29 23:13 · Score: 1

Tabs are for indenting, spaces are for lining up. It's crazy that only about 5% of the programmers I've ever met seem to get the concept.

Getting your editor to display tabs as something other than " " helps. I use ">---" on vim. If you press the wrong key, it looks wrong on the screen.
Re:All the world is not a PC by peterpi · 2007-07-29 23:40 · Score: 1

Bollocks, wrong thread. See what tabbed browsing does for you!?
Re:All the world is not a PC by trolltalk.com · 2007-07-30 09:02 · Score: 1

"or maybe even kibblebytes"
... suitable for loading and programs that run like a dog ...

--
Kevin Smith on Prince

Re:C++ I get by AuMatar · 2007-07-29 04:34 · Score: 1

Pretty much every embedded program in existence. Own a printer? That's several hundred thousand lines of C in there.

--
I still have more fans than freaks. WTF is wrong with you people?

Re:C++ I get by mechsoph · 2007-07-29 04:36 · Score: 1

You, whenever you compile C++ code, as it is compiled to C before machine code (unless you are using an exotic compiler such as the Compaq AXP C++ compiler for TRU64)

GCC parses C++ to its tree IR; there is no translation to C.

Re:C++ I get by Anonymous Coward · 2007-07-29 04:40 · Score: 0

Thing is, 'the world' is built on C/C++ and this won't change soon; everywhere you look it's C/C++ libs and stuff.
I'm trying desperately to move away from C++, it's dusty and a hell of a language with loads of problems, BUT the average neighborhood library has a C (or C++) interface. So either you fiddle around with more or less weird X-to-C call libraries, or you stick with C/C++. Sad, but that's the way it is ;/

Not for command lines ... by FrnkMit · 2007-07-29 04:51 · Score: 1

I haven't even read TFA, but I know that gperf isn't for command lines; getopt() (in its various forms) more than adequately does its job.

One real use of gperf and perfect hashes that I know of is in TAO (The ACE ORB), an implementation of CORBA. Since CORBA includes the class and method names as strings, a perfect hash speeds up each lookup of the actual routine to call.

In modern times, I can imagine gperf (or a Java/C#/Ruby/whatever port) speeding up SOAP or other XML-based protocols.

I like gperf, but... by Anonymous Coward · 2007-07-29 04:53 · Score: 0

...it's more than a little pointless to use it for command-line options, especially in C++. For
one thing, as others have pointed out, I have a hard time imagining a case in which command-line
parsing is a real bottleneck for any application. And, given that that's the case, having to write
lots of special functions and use extra tools for something that is a problem solved well through
freely-available libraries seems like something of a waste of time. I assume that the true purpose of
the article was to remind people of gperf.

Respectfully to the IBM authors, you might as well just use lex and perhaps yacc if you're
dealing with C and need to write a parser, or a library that does a much better job of handling
command-line options (such as GNU getopt) and their problems which range far beyond merely parsing
things.

With C++, you have available those libraries as well, but if you want to try other approaches, Boost
("http://www.boost.org") has a very nice command-line option library that also sports an expressive
notation for describing the options in code.

In any case, it's nice to see an article on gperf, but here it felt somewhat ridiculously applied.

Wrong about 4 (or at least, very out of date) by jdennett · 2007-07-29 04:55 · Score: 1

It's been many years since most C++ compilers used C as an intermediate language. CFront did, and some EDG-based compilers do, but most current C++ compilers do not.

C does have its strengths, such as the relative simplicity of C90 and its lack of dependency on sophisticated compilers and runtimes, but its use as an IL is largely historical.

Re:C++ I get by hxnwix · 2007-07-29 05:04 · Score: 1

I used to write C/C++ compilation/optimisation stuff for a living, so I guess I know something about the topic....

Good guess. Name decoration and limited knowledge of C++'s origins led me to conclude that most C++ compilers still act as front ends. So, we don't all use C anymore...

Wrong in so many ways by geophile · 2007-07-29 05:05 · Score: 4, Insightful

Perfect hash functions are curiosities. If you have a static set of keys, then with enough work you can generate a perfect (i.e. collision-free) hash function. This has been known for many years. The applicability is highly limited, because you don't usually have a static set of keys, and because the cost of generating the perfect hash is usually not worth it.

Gperf might be reasonable as a perfect hash generator for those incredibly rare situations when the extra work due to a hash collision is really the one thing standing between you and acceptable performance of your application.

I thought maybe we were seeing a bad writeup, but no, it's the authors themselves who talk about the need for high-performance command-line processing, and give the performance of processing N arguments as O(N)*[N*O(1)]. I cannot conceive of a situation in which command-line processing is a bottleneck. And their use of O() notation is wrong (they are claiming O(N**2) -- which they really don't want to do, not least because it's wrong). O() notation shows how performance grows with input size. Unless they are worrying about thousands or millions of command-line arguments, O() notation in this context is just ludicrous.

I don't know why I'm going on at such length -- the extreme dumbness of this article just set me off.

Re:Wrong in so many ways by pclminion · 2007-07-29 05:23 · Score: 1

Gperf might be reasonable as a perfect hash generator for those incredibly rare situations when the extra work due to a hash collision is really the one thing standing between you and acceptable performance of your application.

The primary REAL use of gperf is generating keyword recognizers for language parsers. It's another tool in the same vein as lex and yacc.
Re:Wrong in so many ways by Anonymous Coward · 2007-07-29 06:17 · Score: 0

That is because you are too dumb to understand the article. The code becomes difficult to maintain with time as the number of if-else comparisons increases. The authors are specifically pointing to such scenarios, where the number of options and the amount of parsing increase. gperf is simply a way to generate the code dynamically. It is not the only way, and it won't be too hard to write a similar function anyway.
Re:Wrong in so many ways by Hydrogenoid · 2007-07-29 07:26 · Score: 1

O() notation shows how performance grows with input size.
Really?
I'd really like to see an algorithm whose performance grew with input size...
Re:Wrong in so many ways by Anonymous Coward · 2007-07-29 07:40 · Score: 0

The applicability is highly limited, because you don't usually have a static set of keys, and because the cost of generating the perfect hash is usually not worth it.

Agreed. Perfect hashes are a complete waste of time. You're generally better off using something like Tries or Judy arrays, which are "similar to a highly-optimised 256-ary trie data structure".

For a Trie, insert and lookup are both O(m), where "m" is the length of the input (independent of the number of elements in the Trie). Hint: That's as fast as possible (to within a constant).

One of the other advantages of a Trie is the fact that it can be enumerated alphabetically in an efficient manner (using ordered traversal of the underlying tree structure). Also, unless we're talking about millions or billions of options, the downsides to using a Trie won't really come into play.
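
A minimal trie sketch to make the O(m) claim concrete. This is an assumption-laden illustration, not a Judy array and not tuned for speed: each node keeps its children in a std::map, so a step costs a small ordered lookup over the alphabet, and the class and method names are made up here.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

class Trie {
public:
    // Walks (and creates) one node per character: O(m) steps for a key of length m.
    void insert(const std::string& key) {
        Node* n = &root_;
        for (char c : key) {
            std::unique_ptr<Node>& child = n->children[c];
            if (!child) child.reset(new Node());
            n = child.get();
        }
        n->terminal = true;
    }

    // Also O(m) steps, independent of how many keys are stored.
    bool contains(const std::string& key) const {
        const Node* n = &root_;
        for (char c : key) {
            auto it = n->children.find(c);
            if (it == n->children.end()) return false;
            n = it->second.get();
        }
        return n->terminal;
    }

    // Ordered traversal: keys come out sorted by character value.
    std::vector<std::string> sorted_keys() const {
        std::vector<std::string> out;
        std::string prefix;
        collect(root_, prefix, out);
        return out;
    }

private:
    struct Node {
        bool terminal = false;
        std::map<char, std::unique_ptr<Node>> children;
    };

    static void collect(const Node& n, std::string& prefix,
                        std::vector<std::string>& out) {
        if (n.terminal) out.push_back(prefix);
        for (const auto& kv : n.children) {
            assert(kv.second != nullptr);   // child slots are never left empty
            prefix.push_back(kv.first);
            collect(*kv.second, prefix, out);
            prefix.pop_back();
        }
    }

    Node root_;
};
```

Inserting "verbose", "version" and "help" and then calling sorted_keys() should return them in alphabetical order, which is the enumeration advantage mentioned above.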
Re:Wrong in so many ways by geophile · 2007-07-29 08:14 · Score: 1

No, I get that, my point is that a hash table with collisions is probably just fine. Another responder mentioned the use of perfect hashes to avoid collisions when looking for reserved words in a parse. That application makes sense, but again, I really wonder if perfect hashing is worth the trouble. Does it really provide a noticeable performance improvement over an out-of-the-box hash table?
Re:Wrong in so many ways by dkf · 2007-07-29 09:23 · Score: 1

Does [perfect hashing] really provide a noticeable performance improvement over an out-of-the-box hash table?

Yes, but only if you can pre-compute the hash function and pre-size the table right. That's really quite hard to do; the effort involved is such that it is usually easier to not bother. But if you've got a program that's going to do billions of hash lookups and the keys are well-behaved, it can be a worthwhile optimization.

Strings (in English or any programming language) aren't generally well-behaved in the right sense though. Not unless your hash function is a crypto-hash, and that's typically quite a bit more expensive in other ways.

These days, my main criterion for a hash implementation is that someone else wrote it (and wrote it correctly). ;-)

--
"Little does he know, but there is no 'I' in 'Idiot'!"
Re:Wrong in so many ways by tqbf · 2007-07-29 18:40 · Score: 2, Interesting

I challenge: cite as an example any fixed set of strings (such as would be applicable for perfect hashing) for which a realistic perfect hashing scheme of any sort outperforms a statically-sized conventional chaining table using a trivial 33/37-style string hash. I don't think you can. Gperf languishes in obscurity for a reason.
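
For reference, a minimal sketch of the kind of baseline being described, under assumptions: hash33 starts from 0 (some variants start from a different seed), the chains are std::vector buckets rather than linked lists, and KeywordTable is a made-up name. It makes no performance claim either way.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Multiplicative string hash: h = h * 33 + byte. Use 37 for the other common variant.
unsigned long hash33(const std::string& s) {
    unsigned long h = 0;
    for (unsigned char c : s) {
        h = h * 33u + c;
    }
    return h;
}

// Bucket count is fixed at construction; colliding keys share a bucket's chain.
class KeywordTable {
public:
    explicit KeywordTable(std::size_t buckets) : buckets_(buckets) {
        assert(buckets > 0);
    }

    void add(const std::string& key, int id) {
        buckets_[hash33(key) % buckets_.size()].push_back({key, id});
    }

    // Returns the id stored for key, or -1 if key is not in the table.
    int find(const std::string& key) const {
        for (const Entry& e : buckets_[hash33(key) % buckets_.size()]) {
            if (e.key == key) return e.id;
        }
        return -1;
    }

private:
    struct Entry {
        std::string key;
        int id;
    };
    std::vector<std::vector<Entry>> buckets_;
};
```

For a fixed keyword set the bucket count can be picked once by hand so that chains stay short; a lookup is then one hash plus one or two string comparisons.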
Re:Wrong in so many ways by tqbf · 2007-07-29 18:45 · Score: 1

Judy arrays are kind of silly, but I used to think tries were a great answer for parsing, because they provide O(m) abbreviation matching and access to ambiguous options. But then I realized: it's 1998 (hey, I'm old); why am I optimizing something that will run in individual milliseconds even if I search linearly?
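
A minimal sketch of the linear-search alternative, with abbreviation matching and ambiguity detection. The Match enum, the Result struct and match_option are hypothetical names; the rule that an exact match beats a prefix match is an assumption for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Match { None, Unique, Ambiguous };

struct Result {
    Match kind;
    std::string option;   // full option name when kind == Match::Unique
};

// Linear search over all options. An exact match wins outright;
// otherwise the typed text must be a prefix of exactly one option.
Result match_option(const std::vector<std::string>& options,
                    const std::string& typed) {
    Result r{Match::None, ""};
    if (typed.empty()) return r;
    for (const std::string& opt : options) {
        if (opt == typed) return Result{Match::Unique, opt};
        if (opt.compare(0, typed.size(), typed) == 0) {
            if (r.kind == Match::None) {
                r = Result{Match::Unique, opt};
            } else {
                r.kind = Match::Ambiguous;
                r.option.clear();
            }
        }
    }
    return r;
}
```

Given the options "verbose", "version" and "help", "h" resolves to "help", "verb" resolves to "verbose", and "ver" is reported as ambiguous.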
Re:Wrong in so many ways by Anonymous Coward · 2007-07-29 21:21 · Score: 0

There's a yo momma joke in there somewhere, I can feel it...
Re:Wrong in so many ways by crucini · 2007-07-30 04:28 · Score: 1

Does it really provide a noticeable performance improvement over an out-of-the-box hash table?

I haven't looked at gperf yet, but an out-of-the-box hash table doesn't usually offer persistence. Meaning, the program would have to build the table at startup from a list of entries. If gperf allows the program to start with the hash table ready to use, that could be an improvement.
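
To illustrate the "ready at startup" point without gperf, here is a minimal sketch that swaps in a different technique: a constant array kept sorted by hand and searched with std::lower_bound. This is not what gperf generates; the keywords, ids and names are made up, and the only thing it shares with the idea above is that the table is plain constant data with nothing to build at run time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>

struct Keyword {
    const char* name;
    int         id;
};

// Must stay sorted by name (strcmp order). The table is constant data,
// so nothing is inserted or hashed when the program starts.
static const Keyword kKeywords[] = {
    {"help", 1},
    {"output", 2},
    {"verbose", 3},
    {"version", 4},
};

// Binary search; returns the id, or -1 if name is not a keyword.
int lookup(const char* name) {
    assert(name != nullptr);
    const Keyword* first = std::begin(kKeywords);
    const Keyword* last  = std::end(kKeywords);
    const Keyword* it = std::lower_bound(
        first, last, name,
        [](const Keyword& k, const char* n) { return std::strcmp(k.name, n) < 0; });
    return (it != last && std::strcmp(it->name, name) == 0) ? it->id : -1;
}
```

The cost of this version is that someone has to keep the array sorted; a generated table moves that chore into the build instead.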

Historically? by ClosedSource · 2007-07-29 05:07 · Score: 3, Insightful

"Command-line processing is historically one of the most ignored areas in software development."

This is like saying that walking is historically one of the most ignored areas in human transportation.

Re:Historically? by alelade · 2007-07-30 07:18 · Score: 0

"This is like saying that walking is historically one of the most ignored areas in human transportation."

Exactly, and if you ignore command-line processing where it is needed, the result will be calling a cab to go to the bathroom.

MOD UP by ipjohnson · 2007-07-29 05:12 · Score: 0, Redundant

Wish I had some mod points, great reply.

is this a joke? by oohshiny · 2007-07-29 05:14 · Score: 2, Insightful

If it's not, the author of that article should be kept as far away from writing software as possible; he epitomizes the attitude that so frequently gets C++ programmers into trouble.

Re:is this a joke? by turgid · 2007-07-29 08:32 · Score: 2, Insightful

Well, what do you expect from IBM? It's just another one of their look-Ethel-it's-open-source-and-look-at-us-helping-the-community content-free PR fluff pieces. Ignore them and they'll crawl back into their mainframe cave.

--
Stick Men

devkitARM by tepples · 2007-07-29 05:17 · Score: 3, Informative

> And you do so using MinGW and c++?

Yes, I do so with devkitARM (a cross-compiling GCC toolchain that is itself compiled with MinGW) and C++.

Re:devkitARM by sholden · 2007-07-29 05:44 · Score: 1

What the toolkit is compiled with is irrelevant. You're not using it unless you are compiling code targeted to MS Windows, which I don't think you are. Doing the iostream versus stdio hello world on local gcc gives a difference of 496 bytes hence my guess that the way MinGW links libraries might be the reason for the bloat. And since MinGW targets win32, bloat is simply not an issue.

gcc: the perfect candidate? by e9th · 2007-07-29 05:17 · Score: 1

Tons of options, but what do we see? Only stuff like

else if (! strncmp (argv[i], "-print-file-name=", 17))

Maybe they're just too scared of its present options processing to change it.
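
The hard-coded length (17) in that strncmp call is the fragile part. A minimal sketch of the same prefix test with the length derived from the prefix itself; `option_value` is a hypothetical helper name, not something from the gcc sources:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: if `arg` starts with `prefix`, return a pointer to the
// text that follows the prefix (possibly an empty string); otherwise nullptr.
const char* option_value(const char* arg, const char* prefix) {
    assert(arg != nullptr && prefix != nullptr);
    const std::size_t n = std::strlen(prefix);  // no magic number to keep in sync
    return std::strncmp(arg, prefix, n) == 0 ? arg + n : nullptr;
}
```

Called as option_value(argv[i], "-print-file-name="), it hands back the file name part or nullptr, so the caller never has to count characters.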

Re:C++ I get by StripedCow · 2007-07-29 05:18 · Score: 1

C is indeed not a good intermediate language for the reasons you mentioned.
But C-- may be (http://cminusminus.org/)
Perhaps the kernel developers should be coding in *that* language :-)

--
If Pandora's box is destined to be opened, *I* want to be the one to open it.

Re:C++ I get by DirtySouthAfrican · 2007-07-29 05:20 · Score: 1

I don't think C++ compilers compile to C anymore... I know Borland's TPC did this, but that was back when C++ was built on top of C.

Is this a fucking joke? by pclminion · 2007-07-29 05:21 · Score: 2, Funny

Where's the Foot icon? Optimizing command line parsing? Oh God, my sides are splitting.

Re:Is this a fucking joke? by moosesocks · 2007-07-29 11:14 · Score: 2, Insightful

The weird bit is that, despite being a somewhat silly article, it launched one of the most intelligent discussions I've seen on /. in a while.

--
-- If you try to fail and succeed, which have you done? - Uli's moose

This is ridiculous by Bluesman · 2007-07-29 05:29 · Score: 1

First of all, how many programs have command line parsing as a bottleneck?

Secondly, they should put this functionality into GCC instead, so that it creates a perfect hash for any large switch statement.

--
If moderation could change anything, it would be illegal.

Re:This is ridiculous by Anonymous Coward · 2007-07-29 06:04 · Score: 0

Switch statements use integer keys, so you don't need a hash table. You can directly index into a jump table (assuming the indexes are reasonably compact; if your only three cases are 0, a million, and a billion, obviously a compiler would rather use if-else statements). It's very low overhead, which was the whole point of including a switch statement to begin with. :)
Re:This is ridiculous by larry+bagina · 2007-07-29 06:05 · Score: 1

gperf is concerned with string hashing. c switch statements use integers. All modern c compilers (even gcc) look at the case density and build an indirection table or set of if/else/else branches. (Or sometimes both).

--
Do you even lift?
These aren't the 'roids you're looking for.
Re:This is ridiculous by TehZorroness · 2007-07-29 16:58 · Score: 1

You can't put strings into a switch statement. What would you be hashing?
Re:This is ridiculous by oudzeeman · 2007-07-30 00:38 · Score: 1

command line parsing may not be a bottleneck, but if you have to call your program a few thousand times in a script or something this can add up after a while. I mean we could be talking about one or two minutes saved!!
Re:This is ridiculous by Anonymous Coward · 2007-07-30 02:54 · Score: 0

> gperf is concerned with string hashing. c switch statements use integers. All modern c compilers (even gcc) look at the case density and build an indirection table or set of if/else/else branches. (Or sometimes both).

Duh! I think the parent realizes that switch is limited to integers.. clearly though.. it would be practical if it could use strings as well and use O(1) lookup to find the appropriate case.

However, doing so in C/C++ is kind of odd considering they don't have native string support (Strings are not considered a primitive type).
Re:This is ridiculous by crucini · 2007-07-30 04:36 · Score: 1

You can hash integers (or, of course, pointers). Example: you have a "tree" of nodes which may contain cycles. Write a function bool has_cycle(node *root) that determines if the "tree" has at least one cycle. You cannot modify the node structure.

Can your function run in O(n) time, where n is the number of nodes?
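
One possible answer, as a sketch rather than the intended solution: hash the node pointers into a side table so the nodes themselves are never modified. The `node` layout below (a vector of child pointers) is an assumption, since the post doesn't give one. Each node and edge is handled once, so the expected time is O(n + edges) with a typical hash table.

```cpp
#include <unordered_map>
#include <vector>

// Assumed node layout; the post only says the structure cannot be modified.
struct node {
    std::vector<node*> children;
};

enum class visit_state { in_progress, done };

// Depth-first walk. Reaching a node that is still in progress means we
// followed an edge back to an ancestor, which is a cycle. Reaching a node
// that is already done is just shared structure, not a cycle.
static bool dfs(const node* n,
                std::unordered_map<const node*, visit_state>& seen) {
    if (n == nullptr) return false;
    auto it = seen.find(n);
    if (it != seen.end()) return it->second == visit_state::in_progress;
    seen[n] = visit_state::in_progress;
    for (const node* child : n->children) {
        if (dfs(child, seen)) return true;
    }
    seen[n] = visit_state::done;
    return false;
}

bool has_cycle(node* root) {
    std::unordered_map<const node*, visit_state> seen;  // keyed by pointer value
    return dfs(root, seen);
}
```

The recursion depth equals the longest path, so a very deep structure would want an explicit stack instead.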
Re:This is ridiculous by multipartmixed · 2007-08-02 07:19 · Score: 1

Um, You CAN put strings into a C switch statement. You know, if you really wanted to badly enough.

You could be all like, hey look how leet MY program is!
switch((char *)strtoll(optarg + 2, NULL, 16)) { case "this string": do(this); break; case "that string": do(that); break; }
And then you'd be all like

# ./myprogram -i 0xff3323d -j 0x4733773f

I r0x0r d00d, eye yam s0 1337!!!1!11!!

The only problem with that is that every time you rebuilt the program, you'd have to update the manual with new pointer addresses.

--

Do daemons dream of electric sleep()?

Re:It is if the linker complains about not finding by tqbf · 2007-07-29 05:35 · Score: 2, Interesting

Are you seriously trying to argue that gperf is more portable than getopt?

Re:C++ I get by Anonymous Coward · 2007-07-29 05:39 · Score: 0

At least the Comeau C++ compiler still generates C code, and is known as one of the most portable and standard-compliant C++ compilers (including support for exported templates!). So compiling C++ to C is definitely a viable strategy (although I can understand compiler vendors that want to offer a complete toolchain take a different approach).

Re:It is if the linker complains about not finding by tepples · 2007-07-29 05:45 · Score: 1

> Are you seriously trying to argue that gperf is more portable than getopt?

I'm not arguing specifically in favor of gperf, but arguing generally that reinventing the standard library has its justifications at times.

Re:C++ I get by Goalie_Ca · 2007-07-29 05:52 · Score: 0

It only uses a 'subset' of c++ called c ;)

--

----
Go canucks, habs, and sens!

Re:C++ I get by PerlDudeXL · 2007-07-29 06:03 · Score: 1

> You, whenever you compile C++ code, as it is compiled to C before machine code

One of my Computer Science Profs said something similar. He argued that C and C++
are basically the same outdated shit and professionals would only use Java in real-world
applications. The best thing: He ran Ubuntu and all sorts of Gnome stuff on his Laptop.

Byte counts when compiled with devkitARM by tepples · 2007-07-29 06:10 · Score: 1

> What the toolkit is compiled with is irrelevant. You're not using it unless you are compiling code targeted to MS Windows, which I don't think you are.

I knew that. But I have generally seen overheads of the same magnitude when using standard C++ libraries on devkitARM as on MinGW. I just tried it on the GBA: 5,156 bytes for hello-world.mb, which just pushes a C string straight into agbtty_puts(), and 253,652 bytes for hello++.mb, which pushes output through a std::ostringstream and then into agbtty_puts(). (The limit for a .mb executable is 262,144 bytes, as the other 128 KiB of RAM in the system is specialized.)

> Doing the iostream versus stdio hello world on local gcc gives a difference of 496 bytes

What "local" platform are you talking about? Does it use a dynamically linked C++ standard library?

Re:Byte counts when compiled with devkitARM by sholden · 2007-07-29 09:15 · Score: 1

My local ARM NAS box running linux, of course it uses dynamic linking, I'm not a sadist. If I statically link, the executable size for the iostreams version is double the size of the stdio version.

C++ libraries are big. I'd assume that if you wanted to use them in a low-RAM environment you'd write/buy/steal/download space-efficient implementations (if such a thing exists; templates are embedded pretty deep and they bloat the binary).
Re:Byte counts when compiled with devkitARM by pyrrhonist · 2007-07-29 10:20 · Score: 1

> My local ARM NAS box running linux

Are you running a "slug", or some other box?

--
Show me on the doll where his noodly appendage touched you.
Re:Byte counts when compiled with devkitARM by andreyw · 2007-07-29 11:52 · Score: 1

Calling your Linksys slug a "NAS box" is probably pushing it juuuust a bit.

Plus it runs on MIPS.
Re:Byte counts when compiled with devkitARM by sholden · 2007-07-29 16:04 · Score: 3, Funny

It's not pushing it at all. It's storage, it's network attached, it's in a box... What I am pushing is the poor little linksys device. It's plugged into 4 USB hard drives (plus a thumb drive, but that's just for booting) which it's running software RAID5 on. Poor little thing, if it could scream I'm sure it would be. Sadly it's the only machine with a C++ compiler on it at home these days...

Please don't tell the poor thing it's running on MIPS, the ARMv5TE kernel might just freak out and collapse the universe.

wow, how pointless is that by Anonymous Coward · 2007-07-29 06:13 · Score: 1, Insightful

I've probably used more time typing this message than every program I've ever run has used parsing command line arguments.

Re:It is if the linker complains about not finding by larry+bagina · 2007-07-29 06:21 · Score: 1

Here's something you've all been waiting for: the AT&T public domain source for getopt(3). It is the code which was given out at the 1985 UNIFORUM conference in Dallas. I obtained it by electronic mail directly from AT&T. The people there assure me that it is indeed in the public domain.

There is no manual page. That is because the one they gave out at UNIFORUM was slightly different from the current System V Release 2 manual page. The difference apparently involved a note about the famous rules 5 and 6, recommending using white space between an option and its first argument, and not grouping options that have arguments. Getopt itself is currently lenient about both of these things: white space is allowed, but not mandatory, and the last option in a group can have an argument. That particular version of the man page evidently has no official existence, and my source at AT&T did not send a copy. The current SVR2 man page reflects the actual behavior of this getopt. However, I am not about to post a copy of anything licensed by AT&T.

I will submit this source to Berkeley as a bug fix.

I, personally, make no claims or guarantees of any kind about the following source. I did compile it to get some confidence that it arrived whole, but beyond that you're on your own.

--
Do you even lift?

These aren't the 'roids you're looking for.

I disagree by www.sorehands.com · 2007-07-29 06:23 · Score: 1

While I agree that most modern compilers can out optimize the average programmer, you are still looking at generalities.

Both compilers and abstract container classes have to deal with generalities which may not apply to YOUR specific case. The class writer does not know the specific case or conditions (presuming you are not writing the class for that specific condition). A class writer has to (or should) check arguments and conditions, whereas if you know they have been checked (and are damn well sure) you can skip that.

When writing an abstraction layer, you are adding a layer.

On the other hand, a good programmer would not try to optimize a bubble sort. I was working on a resource compiler (in DOS) back in 1989. It would take 45 minutes to 'compile'. I rewrote it to take about 3:15 minutes. But during the writing my AVL btree insert was taking forever. It would allocate the memory to do the insert, and when it found the word was already in there, it would free it. Deferring the allocation fixed that.

If you know the entire program/system you can better optimize.

--
Fight Spammers!

gperf for options?!?!?! by Anonymous Coward · 2007-07-29 06:26 · Score: 0

?!?!?‽

Talk about overkill. gperf is for parsers. Yeah, I know getopt is itself a parser, but I think anyone who's done real programming knows what I mean.

Besides, perfect hashes are old and busted. Cuckoo hashes give you almost identical performance and are far more flexible.

Re:gperf for options?!?!?! by obarel · 2007-07-29 08:40 · Score: 1

Thank you!

I am about to write a tiny XML parser for an embedded system (why XML? I didn't write the spec...). I know all available element and attribute names, and I just need to find them very quickly.

gperf was the first thing I was thinking of, but it looks like cuckoo hashing is going to be my choice now.

No idea why a score of 0 with such a contribution, but thanks anyway.

Another approach - parseargs by argent · 2007-07-29 06:29 · Score: 2, Interesting

Something Eric Allman wrote many moons ago. I found it and modified it to support "native" command line syntax on MS-DOS, VMS, and AmigaDOS, and added some support for improved self-documentation... and then Brad Appleton saw it and rapidly enhanced it to support a plethora of shells and interfaces until it took up 10 posts in comp.sources.misc.

The following two directories should bring it up to the latest version I know of.

This is not efficient, mind you. Command line parsing doesn't generally need to be efficient, even by my miserly standards, honed when a PDP-11 was something you hoped to upgrade to... some day...

ftp://ftp.uu.net/usenet/comp.sources.misc/volume29/parseargs/
ftp://ftp.uu.net/usenet/comp.sources.misc/volume30/parseargs/

PARSEARGS
extracted from Eric Allman's NIFTY UTILITY LIBRARY
Created by Eric P. Allman <eric@Berkeley.EDU>
Modified by Peter da Silva <peter@Ferranti.COM>
Modified and Rewritten by Brad Appleton <brad@SSD.CSD.Harris.COM>

Brad's latest work in this area seems to be here:

http://www.cmcrossroads.com/bradapp/ftp/src/libs/C++/CmdLine.html

http://www.cmcrossroads.com/bradapp/ftp/src/libs/C++/Options.html

Re:It is if the linker complains about not finding by Anonymous Coward · 2007-07-29 06:35 · Score: 0

on W*nd?ws systems

Watchawackabindows?
Wackinteluntilandows?
Winapackatindows?
Please! Expand that wildstar! Whatever could it match?

only relevant to static linking by sentientbrendan · 2007-07-29 06:36 · Score: 4, Informative

It sounds like the author is statically linking his library and running on an embedded system. It is not surprising in that case that the c++ standard library brings in much more code than the c standard library, but it should be made clear that it is not relevant to desktop developers, pretty much all of which dynamically link with glibc.

Again, to be clear, dynamically linking with the c++ standard library is not going to increase your executable size. Please don't try to roll your own code that exists in the standard library. It is a real nuisance when people do that.

I should qualify that by saying that template instantiations do (of course) increase executable size, but that they do so no more than if you had rolled your own.

Re:only relevent to static linking by joto · 2007-07-30 06:41 · Score: 0, Offtopic

In exactly which way do templates NOT increase executable size?
The C++ standard library consists almost entirely of header files, whose templates must be instantiated before they can be used. Just because you are used to assuming that iostreams deals with chars doesn't mean that this isn't hidden behind umpteen layers of template hell in the C++ header files. And given that gcc doesn't do explicit template instantiation very well (at least the last time I bothered to check), I'll bet that these instantiations do not exist in any dynamically linked library, but must exist separately for every single program written in C++.

Re:C++ I get by pclminion · 2007-07-29 06:46 · Score: 1

> There are features in the C++ standard that are so extremely difficult to correctly implement in standard compliant C that it's a complete waste of effort trying to pass via C while compiling.

The only thing I can imagine that would be hard to map directly onto C would be exceptions. Can you confirm that this is what you mean? Because nothing else comes to mind that would be "extremely difficult" to implement.

Even then, it's possible to emulate C++-style exceptions in C. I've done it -- the best description I can think of is "horrifically ugly." But it's possible.

Don't do any embedded development, do ya? by Anonymous+Meoward · 2007-07-29 06:49 · Score: 1

In the embedded realm (not to mention kernel or driver space stuff for any OS), you won't be using much C++. Granted, I've used both in the embedded world, and I prefer C++ whenever I can get away with it. But that ain't often.

One of the problems with C++ in the embedded market is not the language itself, but the mindset of the developers. Most folks who do low-level stuff are not as concerned with code structure and organization as they are the size and speed of the generated code. (Don't believe that? Try working under a tight schedule.) Many of them abhor C++ for its complexity, and more than a few in my experience also don't have enough experience with C++ to use it effectively anyway.

For example, when I worked on a platform that had to be up 24/7 (this wasn't something you'd buy from Best Buy, 'kay?), some enterprising soul tried his hand at C++ and put the following statement in a constructor:

delete this;

Brrr.

Not much C++ occurred in the organization after that one sneaked in.

--
--- The American Way of Life is not a birthright. Hell, it's not even sustainable.

Silly by m.dillon · 2007-07-29 06:54 · Score: 1

This is kinda silly. If you only have a few keywords you don't need anything sophisticated. If you have more than a few but not more than a few dozen it's usually easiest just to arrange them in a linear array and do an index lookup based on the first character to find the starting point for your scan. More than that and you will want to hash them or arrange them in some sort of topology such as a red-black tree.
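
A rough sketch of the middle case (sorted linear array plus a first-character index), under assumed names: the option list and `find_option` are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstring>

// Hypothetical option names, kept sorted so that entries sharing a first
// character sit next to each other.
static const char* const kOptions[] = {
    "all", "append", "brief", "debug", "dump", "help", "verbose", "version"};
constexpr int kCount = static_cast<int>(sizeof(kOptions) / sizeof(kOptions[0]));

// Returns the index of `name` in kOptions, or -1 if it is not present.
int find_option(const char* name) {
    assert(name != nullptr);
    // first[c] is the index of the first option starting with byte c, or -1.
    // Built once, on the first call.
    static const std::array<int, 256> first = [] {
        std::array<int, 256> t;
        t.fill(-1);
        for (int i = kCount - 1; i >= 0; --i)  // backwards: lowest index wins
            t[static_cast<unsigned char>(kOptions[i][0])] = i;
        return t;
    }();
    int i = first[static_cast<unsigned char>(name[0])];
    if (i < 0) return -1;
    // Scan only the run of options that share the first character.
    for (; i < kCount && kOptions[i][0] == name[0]; ++i)
        if (std::strcmp(kOptions[i], name) == 0) return i;
    return -1;
}
```

With a few dozen options the scan after the index lookup touches only a handful of strings, which is the whole appeal of the approach.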

Generally speaking hashes are very cpu and cache-inefficient beasts, especially if one can reap the benefit of the locality of reference you get with other schemes. Hashes are easy to implement, though, so if you have a lot of keywords and there is either no locality of reference anyway or you don't care about the performance, a hash works just fine.

Insofar as strings go, once you get beyond a certain point it's easiest to just hash the string on the front-end, deal with any collisions on the front-end as well (aka implement a string table and modify the hash value for one of the strings if a collision occurs), and then simply reference the string via its hash value in the remainder of the program instead of actually doing any further string comparisons. As an extension of this one can use a larger 64-bit hash and consider any collisions to be fatal. This is extremely viable for a language parser given that the chances of a collision actually occurring are so low you might only get one, or even zero, across the entire domain of source code in existence today.

If you have a fixed set of keywords, then a 16 or 32 bit hash is usually sufficient to avoid collisions. At this point you just generate a header file with the values and switch on them. e.g. hv = hash(str); switch(hv) { case KEYWORD_FOR: ... case ... }. This is equivalent to the use of some sort of data structure but it winds up being coded and optimized directly by the compiler, and it's very easy to understand the resulting source code.
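
For what it's worth, here is a sketch of that hash-and-switch idea where the "generated header" is replaced by a constexpr hash, so the compiler computes the case labels itself. FNV-1a is my choice of hash, not something from the post, and the keyword set and names are made up. A nice side effect is that two keywords hashing to the same value would produce duplicate case labels and fail to compile.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// 32-bit FNV-1a, written recursively so it is a valid C++11 constexpr function.
constexpr std::uint32_t fnv1a(const char* s, std::uint32_t h = 2166136261u) {
    return *s == '\0'
               ? h
               : fnv1a(s + 1, (h ^ static_cast<unsigned char>(*s)) * 16777619u);
}

enum class keyword { kw_for, kw_while, kw_return, unknown };

// Hypothetical classifier: switch on the hash, then confirm with one strcmp,
// because arbitrary input can collide with a keyword's hash.
keyword classify(const char* word) {
    assert(word != nullptr);
    switch (fnv1a(word)) {
        case fnv1a("for"):
            return std::strcmp(word, "for") == 0 ? keyword::kw_for : keyword::unknown;
        case fnv1a("while"):
            return std::strcmp(word, "while") == 0 ? keyword::kw_while : keyword::unknown;
        case fnv1a("return"):
            return std::strcmp(word, "return") == 0 ? keyword::kw_return : keyword::unknown;
        default:
            return keyword::unknown;
    }
}
```

The confirming strcmp can be dropped only if every input is known to come from the fixed keyword set, which is the situation the post describes.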

-Matt

Re:Silly by IkeTo · 2007-07-29 20:10 · Score: 2, Interesting

> Generally speaking hashes are very cpu and cache-inefficient beasts

Um... why do you think hashes are inefficient? In a lot of languages (Perl, Python, Javascript, etc) the standard collection is the hash. In Javascript, even a simple array is a hash! Why do you think it is inefficient?

My thinking is that it is both CPU and cache efficient: it is CPU efficient because it usually just needs one round of computation to get you to the correct result (as compared to a tree, where you need one round per tree level). It is cache efficient because you are usually not led to somewhere irrelevant to your search (in contrast, every intermediate node you visit when searching a binary tree will pollute your cache). Yes, in a hash you have the hash table entries themselves which will pollute the cache, but that's not as much, exactly because of what you talk about: (spatial) locality of reference. In a hash all entries are in nearby memory, so it is likely that many searches in the same hash table will end up using very few cache lines. In contrast, in a search tree or a list, different nodes are allocated at different times and are much more likely to use completely different cache lines. At least this should be true until the time you overload it, but then you have extensible hashes.
Re:Silly by utnapistim · 2007-08-02 01:41 · Score: 1

>> Generally speaking hashes are very cpu and cache-inefficient beasts

> Um... why do you think hashes are inefficient? In a lot of languages (Perl, Python, Javascript, etc) the standard collection is the hash. In Javascript, even a simple array is a hash! Why do you think it is inefficient?

Actually, the GP is right: hashing can be an expensive operation, especially if you need collision-free hash values.

What you're referring to is not hashing, but hash-based collections (hash tables), where the indexing is done by hash value. It's not the same thing: as far as algorithms go, hashing can be (but is not always) expensive; as far as collections go, hashing is great for both read and write access. When you work with lots of data that's difficult to compare (xml or just strings usually) computing the hash only once and using it for comparing can indeed be a great optimization.

--
Tie two birds together: although they have four wings, they cannot fly. (The blind man)

And it's a gpl tool by Suicyco · 2007-07-29 07:14 · Score: 1

Which means that using it at the command line is "linking" it. Doing so, of course, means your upstream code must be GPL as well. Ad Infinitum. Sorry, but the bulk of c/c++ code out there is non-gpl licensed and therefore can take no advantage of tools such as this.

Character encoding conversion by tepples · 2007-07-29 07:15 · Score: 2, Informative

> How many of these embedded tools you write actually _do_ command line processing?

None yet, but they do handle other things that involve dictionaries, such as character encoding conversion. A program designed to move items back and forth between a town in Animal Crossing (for Nintendo GameCube) and a town in Animal Crossing: Wild World (for Nintendo DS) needs to be able to understand the encodings of character names and town names that these games use, possibly by converting between their proprietary 8-bit codecs and UTF-8.

> why don't you invest in more (both memory- and time-) efficient ways to do IPC than the command line?

Because the command line, pipes, and sockets are the most obvious ways for two programs to communicate if their copyright licenses prohibit them from being linked together into one executable.

Re:C++ I get by mce · 2007-07-29 07:24 · Score: 3, Informative

Of course C++ exceptions are what I meant. What else would I mean when using the word "exceptions" in this context?

And yes, C++ exceptions can be expressed in C. After all, C is a glorified assembler and the resulting code from C++ translation is assembler as well. It all depends on the level of abstraction at which the C code is written and on the amount of ugliness/inefficiency you're willing to take on board (and also the trade-off between both of the latter). But that's not the point. The point of this thread is that nowadays it makes no sense to make use of this capability in a C++ compiler. Especially not when considering that a user of a C++ compiler wants more than just a compiler. He also wants a debugger that is able to meaningfully link up the binary and the original C++ source. If you're a C++ compiler vendor, using C as an IL does nothing but complicate your own life. Twice.

--
Linux user since early January 1992.

Which platform uses dynamic libstdc++? by tepples · 2007-07-29 07:25 · Score: 2, Insightful

> It is not surprising in that case that the c++ standard library brings in much more code than the c standard library, but it should be made clear that it is not relevant to desktop developers, pretty much all of which dynamically link with glibc.

On MinGW, the port of GCC to Windows OS, my programs dynamically link with msvcrt, not glibc. Also on MinGW, libstdc++ is static, just like in the embedded toolchain. Are you implying that one of the C++ toolchains for Windows uses a dynamic libstdc++? Which toolchain for which operating system that is widely deployed on home desktop computers are you talking about?

Re:Which platform uses dynamic libstdc++? by Embedded2004 · 2007-07-30 04:46 · Score: 1

I think pretty much all of them but MinGW.

GCC for Linux and MS's cl.exe both let you dynamically link.
Re:Which platform uses dynamic libstdc++? by stonecypher · 2007-07-31 02:07 · Score: 1

He's right, Tepples. Calm down. What you're seeing is a result of the way DKP handles embedded calls. Mute and I had this out in channel a few months ago; you can push a GBA binary using streams down to about 51k using newlib, down to about 14k using MSVS, and down to about 6.1k using GHOC. Once again, you've gone off on a single example and assumed it was a fault in C++ rather than in the GCC libraries that are all you seem to have any experience with.

--
StoneCypher is Full of BS

Re:It is if the linker complains about not finding by Waffle+Iron · 2007-07-29 07:28 · Score: 1

Well, if you're going to reinvent the wheel, you might as well do it compatibly. You can get a BSD-style licensed implementation of getopt and getopt_long that is portable to Windows. From the README:

WHY RE-INVENT THE WHEEL?
I re-implemented getopt, getopt_long, and getopt_long_only because there were noticeable bugs in several versions of the GNU implementations, and because the GNU versions aren't always available on some systems (*BSD, for example.) Other systems don't include any sort of standard argument parser (Win32 with Microsoft tools, for example, has no getopt.)

Re:It is if the linker complains about not finding by tepples · 2007-07-29 07:36 · Score: 1

> Well, if you're going to reinvent the wheel, you might as well do it compatibly. You can get a BSD-style licensed implementation of getopt and getopt_long that is portable to Windows.

Thank you for this link. I've downloaded it, and I'll look at it when I get time.

Re:It is if the linker complains about not finding by ucblockhead · 2007-07-29 07:49 · Score: 2, Insightful

When faced with this issue, I simply wrote a Windows version of getopt. Took about a day.

Even when reinventing the wheel, it is important to reinvent as little as possible. If you need functionality that isn't there, at least keep the same interface.

--
The cake is a pie

Re:C++ I get by bytesex · 2007-07-29 08:26 · Score: 1

When you have 'goto', and 'return', what's so difficult about implementing exceptions in vanilla C ? Even in APIs - you just 'goto' some point that sets a flag and returns, and the 'trying' API-using caller checks the flag upon return. If the namespaces don't clash, it's not a problem at all !

--
Religion is what happens when nature strikes and groupthink goes wrong.

This tool is much easier by stupendou · 2007-07-29 08:35 · Score: 3, Interesting

Try supergetopt instead. Much easier to use and also open source.
http://www.ibiblio.org/pub/Linux/devel/sugerget-1.1.tgz

With this code, you simply specify command-line strings and variables in a printf()
style format.

E.g. supergetopt( argc, argv,
"string1", "%d %d", function1,
"string2", "%s", function2 )

will call function1( int a, int b ) when string1 is on the command line,
and will call function2( char *s ) when string2 is used on the command line.

A whole lot easier than gperf, IMHO.

Boost.Program_Options? by nahpets77 · 2007-07-29 09:00 · Score: 2, Informative

What about Boost.Program_Options? I thought I'd see a post on it here somewhere, but not one person has mentioned it (yet).

A few months ago, I was looking around for a C++ library for parsing command line options. I checked out get_opt and I thought that there must be something that uses std::string instead of char*. After some googling, I found Boost.Program_Options seemed to be exactly what I was looking for. It supports long and short options (-s,--short) and I was able to start using it quite easily after looking at the tutorials.

Re:Boost.Program_Options? by abdulla · 2007-07-29 12:47 · Score: 1

Boost.Program_Options has some odd problems with the GCC visibility flags that cause it to return invalid values. However, after wrapping the header with visibility pragmas it works, and it works well with my needs. I needed a library that would allow me to specify a library on the command line, load that library and add possibly more command line options, then continue processing all other arguments. However PO is rather bloated compared to other options available, but at least it isn't leaking memory like Popt.
Re:Boost.Program_Options? by nahpets77 · 2007-07-29 14:09 · Score: 1

I haven't had any problems with wrong values being returned. Did you properly initialize your variables to default values?

("foo,f", po::value<int>(&foo)->default_value(10) , "some description here")

The main reason I settled on boost::program_options was that it was written in C++ for C++, unlike other libraries out there. An added bonus is that it can also accept configuration files via the parse_config_file function.
Re:Boost.Program_Options? by l3mr · 2007-07-30 00:44 · Score: 1

I wholeheartedly agree. Boost::program_options is great, I use it for all my cmdline parsing.

--
The world always seems brighter when you've just made something that wasn't there before. - Neil Gaiman

Re:C++ I get by mce · 2007-07-29 09:58 · Score: 3, Informative

The main problem (but not the only one) is called "object destructors". You have to make sure they are called. All of them, and in the correct order, at all the nested scopes of execution you are in when the exception occurs. And you need to make sure not to call them on any object not yet constructed (always remember that constructors can throw exceptions too) and never to call a destructor twice (I've seen this kind of bug multiple times in multiple compilers). And then there is the fun of exceptions thrown by destructors, not to mention the possibility that it all happens in the middle of constructing or destructing an array of objects.

All that is why setjmp()/longjmp(), also known as C's non-local goto, don't cut it, which in turn means that you need to complicate function return mechanisms. And just when you think you got that problem sorted out, you need to be aware that C++ functions can call (library) C functions that were never compiled to even know about exceptions but that in turn can call C++ functions that may again throw an exception. The entire construction needs to be able to handle this.

As I wrote in another post in this thread, it can be done. But it is not easy. Note that the entire object destructor issue also applies within a single scope, which is why life is not as easy as replacing every "throw" statement by "goto end;".
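
To make the bookkeeping concrete, here is a small sketch of what the compiler has to get right. The names (`tracer`, `run`) are made up for illustration. The constructor of "c" throws, so "c" itself must not be destructed, "d" never exists, and "b" and "a" have to be destructed in reverse order of construction before the handler runs.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper that records its own construction and destruction.
struct tracer {
    std::string name;
    std::vector<std::string>& log;
    tracer(std::string n, std::vector<std::string>& l, bool fail = false)
        : name(std::move(n)), log(l) {
        if (fail) throw std::runtime_error("ctor failed: " + name);
        log.push_back("ctor " + name);
    }
    ~tracer() { log.push_back("dtor " + name); }
};

std::vector<std::string> run() {
    std::vector<std::string> log;
    try {
        tracer a("a", log);
        {
            tracer b("b", log);
            tracer c("c", log, true);  // throws: c was never fully constructed
            tracer d("d", log);        // never reached
        }
    } catch (const std::runtime_error&) {
        log.push_back("caught");
    }
    return log;
}
```

The log comes out as: ctor a, ctor b, dtor b, dtor a, caught. A plain "goto end;" would have to reproduce exactly that sequence by hand at every possible throw point.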

--
Linux user since early January 1992.

Re:C++ I get by DeKO · 2007-07-29 10:07 · Score: 0

2 words: constructors and destructors.

A constructor may throw an exception; partially-constructed objects may not be destructed. Not only do you have to destruct the objects in the reverse order, but you also have to figure out which objects to destruct. It's much more complex than plain return or goto.

Not only that, exceptions aren't supposed to slow down your program when you don't throw any; so to compile efficiently, you have to maintain some specialized data structures to avoid "checking for exceptions" for every function call.

Then there's the catch block: if you throw a Child object you can catch it as a Mother class; and C++ allows multiple-inheritance (thanks God^H^H^HStroustrup), so at every catch block you have to test (efficiently) if the types match.

So I don't think a compiler supporting just goto and return can acceptably implement exceptions.
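The catch-matching point can be illustrated with a small sketch. The class names (`Mother`, `Father`, `Child`, `Stranger`) and the function `which_handler` are made up for the example. A thrown `Child` is caught by a handler for its second base class, so the runtime has to compare the thrown type against each handler and adjust the pointer to the right base subobject.

```cpp
#include <cassert>
#include <string>

// Hypothetical hierarchy: Child inherits publicly from two unrelated bases.
struct Mother { virtual ~Mother() = default; };
struct Father { virtual ~Father() = default; };
struct Child : Mother, Father {};
struct Stranger {};

// Throws an object of the requested kind and reports which handler matched.
std::string which_handler(bool throw_child) {
    try {
        if (throw_child) throw Child{};
        throw Stranger{};
    } catch (const Father&) {   // second base of Child: still a match
        return "Father";
    } catch (const Mother&) {   // would also match, but handlers are tried in order
        return "Mother";
    } catch (...) {             // anything unrelated to the hierarchy
        return "unknown";
    }
}
```

Handlers are tried in the order written, so `which_handler(true)` returns `"Father"` and `which_handler(false)` falls through to the catch-all.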

Did you look at the "listing 1" sample by Anonymous Coward · 2007-07-29 10:57 · Score: 0

Did you look at the "listing 1" sample?
it doesn't make sense at all!

if (strtok(cmdstring, "+dumpdirectory"))
{
    // code for printing help messages goes here
}
else if (strtok(cmdstring, "+dumpfile"))
{
    // code for printing version info goes here
}

strtok doesn't work that way: its second argument is a list of separator characters (like spaces and tabs) that sit between the tokens.

Even if this was meant to be 'strcmp', the return value would be zero on a match, so it would have to be if (!strcmp(...))

Very bizarre. It's like the article wasn't reviewed at all.
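For comparison, here is a sketch of what the listing was presumably trying to do, next to what it actually does. `Command`, `classify`, and `listing1_condition` are names invented here. Only `std::strcmp`, `std::strtok`, and `std::strncpy` are real library calls.

```cpp
#include <cassert>
#include <cstring>

enum class Command { DumpDirectory, DumpFile, Unknown };

// Corrected comparison: strcmp returns 0 when the two strings are equal.
Command classify(const char* cmdstring) {
    if (std::strcmp(cmdstring, "+dumpdirectory") == 0) {
        return Command::DumpDirectory;
    } else if (std::strcmp(cmdstring, "+dumpfile") == 0) {
        return Command::DumpFile;
    }
    return Command::Unknown;
}

// Reproduces the first condition from the quoted listing. strtok treats every
// character of "+dumpdirectory" as a delimiter, so it returns a non-null token
// for almost any input. The argument is copied because strtok modifies its buffer.
bool listing1_condition(const char* arg) {
    char buf[64];
    std::strncpy(buf, arg, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    return std::strtok(buf, "+dumpdirectory") != nullptr;
}
```

With this sketch, `listing1_condition("+help")` and `listing1_condition("+dumpfile")` are both true, so the first branch of the original listing would fire for either argument, while `classify` tells them apart.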

Re:C++ I get by Sancho · 2007-07-29 11:58 · Score: 1

Interestingly, C is no longer a proper subset of C++.

Re:It is if the linker complains about not finding by VGPowerlord · 2007-07-29 14:23 · Score: 1

Are you seriously trying to use such a blatant and obvious strawman?

--
GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011

Re:It is if the linker complains about not finding by tqbf · 2007-07-29 14:48 · Score: 2, Informative

Absolutely. There is no platform for which gperf is a better, more portable option for command line processing than getopt. I'm not sure what you think getopt does that is "tricky" under Win32. It's a string processor.

Re:It is if the linker complains about not finding by __aawavt7683 · 2007-07-29 14:57 · Score: 2, Informative

When faced with the issue of implementing getopt on Windows, I merely took the code from FreeBSD: src/lib/libc/stdlib/getopt.c

I love FreeBSD. (I once changed the motherboard, rebooted, went, "Oh.. shit," and proceeded to log in. All drivers are compiled as modules, in less time than my lean Linux kernel. :-/)

I sidestepped the license issue, stripped out extraneous header files, changed a couple of references to _getprogname() (either to the static string "" or to a global var, as it is in libc), read the man page to figure out how to use it and had a short-form option parser in.. probably under an hour.

Some things you have to code. For everything else, the Regents of the University of California have done it for you.
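For readers who have not used it, a short-form parser built on `getopt` looks roughly like the sketch below. The option letters (`-v`, `-o file`), the `Options` struct, and `parse_short_options` are assumptions chosen for the example. `getopt`, `optarg`, `optind`, and `opterr` are the POSIX interface from `<unistd.h>`, so this compiles as-is only where that header exists (or where a getopt.c has been dropped in, as described above).

```cpp
#include <cassert>
#include <string>
#include <unistd.h>   // getopt, optarg, optind, opterr (POSIX, not ISO C++)

// Hypothetical result type for the example.
struct Options {
    bool verbose = false;
    std::string output;
    int first_operand = 0;  // index of the first non-option argument
    bool ok = true;         // false if an unknown option or missing value was seen
};

Options parse_short_options(int argc, char* const argv[]) {
    Options opts;
    optind = 1;  // restart scanning (some BSDs also want optreset = 1)
    opterr = 0;  // keep getopt from printing its own diagnostics
    int c;
    // "vo:" means: -v takes no argument, -o requires one.
    while ((c = getopt(argc, argv, "vo:")) != -1) {
        switch (c) {
            case 'v': opts.verbose = true; break;
            case 'o': opts.output = optarg; break;
            default:  opts.ok = false; break;  // '?' for anything unexpected
        }
    }
    opts.first_operand = optind;
    return opts;
}
```

Given the arguments `prog -v -o out.txt input`, the sketch sets `verbose`, stores `out.txt` in `output`, and leaves `first_operand` pointing at `input`.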

My experience with command line parsing by Anonymous Coward · 2007-07-29 17:34 · Score: 0

Years ago I spent a few minutes on this, and 30 lines of code later I had a command line parser that placed arguments into an associative array. It's used in roughly 90% of our stuff nowadays and has saved a little bit of time, although we thankfully don't have much to do w/ CLI.
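The poster's parser is not shown, so the following is only a guess at the general shape of such a thing. It assumes a `--key=value` / `--flag` syntax and stores positional arguments under the keys "0", "1", and so on. Both conventions, and the name `parse_args`, are made up for this sketch.

```cpp
#include <cassert>
#include <map>
#include <string>

// Collects "--key=value" and "--flag" arguments into a map. Anything that does
// not start with "--" is stored under a running positional index ("0", "1", ...).
std::map<std::string, std::string> parse_args(int argc, const char* const argv[]) {
    std::map<std::string, std::string> args;
    int positional = 0;
    for (int i = 1; i < argc; ++i) {  // argv[0] is the program name
        const std::string arg = argv[i];
        if (arg.size() > 2 && arg.compare(0, 2, "--") == 0) {
            const std::string::size_type eq = arg.find('=');
            if (eq == std::string::npos) {
                args[arg.substr(2)] = "";                       // bare flag
            } else {
                args[arg.substr(2, eq - 2)] = arg.substr(eq + 1);  // key=value
            }
        } else {
            args[std::to_string(positional++)] = arg;
        }
    }
    return args;
}
```

The rest of the program then asks the map questions such as `args.count("verbose")` or `args.at("name")` instead of walking `argv` again.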

The central problem is that command line arguments can sometimes be very complex and very hard to account for in a systematic manner, mostly because the argument lists, the consumption of parameters, etc. may well depend on what those parameters actually are. That means you're better off keeping a global arg consumption counter with that big switch statement than trying to find some system to abstract what will be more complex to abstract anyway.

Or could the real problem be the complex argument lists? I guess it depends on the specifics of the system, but from my personal experience I don't think it's fair to assume people ignore this sort of thing completely.

Also, efficiency and one-time parsing of an arg list... WTF!?!? What year is this, how many GHz do CPUs run at, and we're talking about the efficiency of CLI parsing? I think I'll shake my head in disbelief and move on :)

Re:It is if the linker complains about not finding by VGPowerlord · 2007-07-29 18:21 · Score: 1

Right, but the great-great-grandparent never said gperf is more portable than getopt. What he did say was that getopt() doesn't work on Windows and that it only handles short arguments. From that, you drew a conclusion not supported by the facts in evidence (to put it in lawyer speak).

--
GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011

Re:It is if the linker complains about not finding by tqbf · 2007-07-29 18:30 · Score: 2, Informative

Again, on the off chance that this helps anyone reading this pitifully long and silly thread: it is trivial to make getopt work on Win32, just like it was trivial to make strsep work on Linux when it only had strtok. I object to the argument that "portability" has anything whatsoever to do with whether you'd use getopt to parse arguments.

Like most of the other comments on this post, I find the idea of using gperf for "high performance argument parsing" superfluous and convoluted. In fact, I find the idea of a general-purpose perfect hash tool a bit superfluous as well; gperf languishes in obscurity for a reason.

Re:C++ I get by bytesex · 2007-07-29 19:50 · Score: 1

Ok, but at that point you have already called their constructors, so you have a list. Just working down that same list calling all the destructors doesn't seem that much of a problem.

--
Religion is what happens when nature strikes and groupthink goes wrong.

PRM by www.sorehands.com · 2007-07-29 19:54 · Score: 1

I still use it. I have not been able to find an e-mail program that will run on OS/2 that allows me to attach notes to e-mails w/o affecting the original e-mail. The PRM stickynote feature.

It does cause fits for opposing counsel when I provide the files in OS/2 format.

--
Fight Spammers!

c+/C++ try powershell by Anonymous Coward · 2007-07-29 20:45 · Score: 0

With PowerShell, anything can be done in MS's latest server products.
Its power lies in exactly what most here address as a problem: arguments.
The answer to that problem is to make everything object-oriented.
This is the latest development, and it combines the good parts of Unix and MS.
The creator actually once worked on Unix, and fixed its shortcomings in a new language.
It's even more powerful than what we could do with VBS (integrating, for example, Excel and file/user management, etc.).
It can now pass whole objects along to be changed, a bit like having a WMI-based command in a one-liner (although it's not limited to one-liners).

Well, in short, the problem isn't the language; it's about what you can do with it in a server environment.

Yup, and it's called.... by DrYak · 2007-07-29 21:12 · Score: 1

> imagine C++, Ruby, and LISP coexisting peacefully without glue

Yup, and it's called Parrot, and it's currently being developed. (There's also the Java VM, which has some undocumented cross-language support. It was designed for Java, but you can compile other languages to bytecode with special compilers and run them successfully.) (There's also the .NET virtual machine, which is officially touted as multi-language by its marketing department and is so-so successful at it, if you don't mind half of the crazy features of your favorite language being unsupported and the other half being implemented as 'language-interpreter-running-as-bytecode-itself-inside-.NET's-VM'.)

The only question is: what language will THAT VM be written in?
That's the trick: most of the high-level languages like Perl, etc. are designed to be run on VMs, and most of the time produce bytecode (for example, there's still no 100% sure way to compile Perl into machine code as of today). Not all of them can output something that can be run on the CPU.

So for now, all VMs are themselves produced in C.

Maybe in the embedded market you will see processors with a specialised unit for VM bytecode (as is currently available for Java on some feature phones), but you won't see the x86 legacy architecture being displaced by it any time soon. It's very hard to change such a legacy, as IA64 vs. AMD64 has proven.

I also dread the day when the whole GUI itself is running inside the VM, like Java's ass-ugly and horribly slow one. High-level languages are nice for glue coding, for putting other technologies (from libraries) together into an application, but GTK itself or some computing library running inside a JVM would be hell.

--
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]

Re:Yup, and it's called.... by TehZorroness · 2007-07-30 04:22 · Score: 1

That's the gap that no one has crossed. It shouldn't be the job of a VM or special environment, but of the operating system itself. While UNIX-like systems are very nice compared to some of the alternatives, I don't think they are the only way of solving the problem at hand, which is allowing users to make the computer do what they want it to do. If you have a somewhat intelligent user, they'll point out that it is much more flexible to work with a full language like Ruby than it is to work with a shell. When you are programming, you are provided an API you can use, while when you are merely using the system, you have a set of programs you can run and that's it. Even if you don't need glue to call upon language B from language A, you still need glue, the programs themselves, to use the functionality from the shell.

Instead of going the UNIX approach, if we look at it from the Lisp machine approach, all of the functionality _is_ available at the user's fingertips; you don't need a glue program to make a really low-level system call or anything like that. There are no boundaries. At the same time, imagine if making a notification show up in the console of your favorite shooter every time someone talks to you in IRC only took 2 or 3 lines in the OS's native shell.

Take a look at the TUNES project. This is exactly what I have in mind. (I've read some of their design documents after planning out my own "dream OS" and the designs coincided perfectly. no joke)

Re:C++ I get by Anonymous Coward · 2007-07-30 00:13 · Score: 0

Or perhaps you mean that C++ is no longer a proper superset of C, considering the history behind the two.

GNU tools by Chemisor · 2007-07-30 01:21 · Score: 1

> Unless they are worrying about thousands or millions of command-line arguments

Clearly they are thinking of the next version of ls...

std::map by Hythlodaeus · 2007-07-30 01:36 · Score: 1

I find std::map to be quite sufficient for this purpose.
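One way this could look, as a sketch rather than a description of the poster's code: a `std::map` from option name to a small handler, so the parsing loop contains no chain of ifs. The `Settings` struct, the two option names, and `apply_options` are all assumptions made for the example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical program settings toggled by the options below.
struct Settings {
    bool verbose = false;
    bool dry_run = false;
};

// Looks each argument up in a map of handlers and invokes the match.
// Returns the number of arguments that were not recognised.
int apply_options(int argc, const char* const argv[], Settings& settings) {
    const std::map<std::string, std::function<void(Settings&)>> handlers = {
        {"--verbose", [](Settings& s) { s.verbose = true; }},
        {"--dry-run", [](Settings& s) { s.dry_run = true; }},
    };
    int unknown = 0;
    for (int i = 1; i < argc; ++i) {  // argv[0] is the program name
        const auto it = handlers.find(argv[i]);
        if (it != handlers.end()) {
            it->second(settings);
        } else {
            ++unknown;
        }
    }
    return unknown;
}
```

Lookup in a `std::map` is logarithmic in the number of options, which for a typical handful of flags is indistinguishable from any hashing scheme.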

--
For great justice.

SimpleOpt by jayloden · 2007-07-30 02:38 · Score: 1

Kind of surprised no one has mentioned SimpleOpt yet. It's public domain, lightweight, self-contained, and works great. I've been using it for quite a while in a C++ project and I've been very happy with it. It's also designed explicitly to be cross-platform and not depend on platform-specific features, whereas e.g. getopt would not necessarily be available on a given platform.

Re:It is if the linker complains about not finding by Ocrad · 2007-07-30 03:27 · Score: 1

getopt() is in the header <unistd.h>, which is in POSIX, not ANSI. POSIX facilities are not guaranteed to be present on W*nd?ws systems. It also handles only short options, not long options. For those, you have to use getopt_long() of <getopt.h>, which isn't even in POSIX. You can also use Arg_parser and avoid all the portability problems of getopt_long.
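For reference, the long-option variant looks roughly like this. The option names (`--help`, `--dumpfile`), the `LongOptions` struct, and `parse_long_options` are invented for the sketch. `getopt_long`, `struct option`, `no_argument`, and `required_argument` come from the GNU-style `<getopt.h>`, which, as noted above, is not part of POSIX, so treat its availability as an assumption about the platform.

```cpp
#include <cassert>
#include <getopt.h>   // getopt_long, struct option (GNU extension, also on the BSDs)
#include <string>

// Hypothetical result type for the example.
struct LongOptions {
    bool help = false;
    std::string dumpfile;
    bool ok = true;  // false if an unknown option or missing value was seen
};

LongOptions parse_long_options(int argc, char* const argv[]) {
    // Each entry maps a long name to the short letter returned by getopt_long.
    static const struct option long_opts[] = {
        {"help",     no_argument,       nullptr, 'h'},
        {"dumpfile", required_argument, nullptr, 'f'},
        {nullptr,    0,                 nullptr, 0}    // terminator
    };
    LongOptions result;
    optind = 1;  // restart scanning
    opterr = 0;  // suppress built-in error messages
    int c;
    while ((c = getopt_long(argc, argv, "hf:", long_opts, nullptr)) != -1) {
        switch (c) {
            case 'h': result.help = true; break;
            case 'f': result.dumpfile = optarg; break;
            default:  result.ok = false; break;
        }
    }
    return result;
}
```

Both `--dumpfile=out.bin` and `-f out.bin` end up in the same `case 'f'` branch, which is the main convenience over hand-rolled long-option code.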

Re:C++ I get by David+Greene · 2007-07-30 03:30 · Score: 1

> One of my Computer Science Profs said something similar. He argued that C and C++ are basically the same outdated shit and professionals would only use Java in real-world applications.

Then your professor is a fool. When Java can do metaprogramming, we can talk. It's an invaluable tool for trading off run-time flexibility for speed. Most polymorphism is in fact completely static.
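As one illustration of static polymorphism (not necessarily the technique the poster has in mind), here is a sketch using the curiously recurring template pattern. `Shape`, `Square`, `Circle`, and `doubled_area` are hypothetical names. The call to `area_impl` is resolved at compile time, so there is no virtual table and the compiler is free to inline it.

```cpp
#include <cassert>

// Base template: forwards to the derived class without any virtual call.
template <typename Derived>
struct Shape {
    double area() const {
        return static_cast<const Derived&>(*this).area_impl();
    }
};

struct Square : Shape<Square> {
    double side;
    explicit Square(double s) : side(s) {}
    double area_impl() const { return side * side; }
};

struct Circle : Shape<Circle> {
    double radius;
    explicit Circle(double r) : radius(r) {}
    double area_impl() const { return 3.0 * radius * radius; }  // pi rounded to 3 to keep the arithmetic exact
};

// Generic code written against the interface. A separate copy is instantiated
// for each concrete type, which is the flexibility-for-speed trade-off.
template <typename D>
double doubled_area(const Shape<D>& shape) {
    return 2.0 * shape.area();
}
```

What is given up is run-time flexibility: a `Square` and a `Circle` no longer share a common base type, so they cannot be stored together in one container without extra machinery.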

--

CLI? by Anonymous Coward · 2007-07-30 05:33 · Score: 0

Who uses the Command Line Interface (CLI) anymore? It seems that with each release of Windows, we are raising a generation of useless "Pointers and Clickers" who are prone to repetitive strain injury (RSI).

Want to change the extension of 100 files in Windows? Good luck pointing and clicking.
Want to add 100 users in Windows, and set default/random passwords? Good luck pointing and clicking.

Heck, with XP home, just point and click your username.

Reducing footprint of static libstdc++ in GCC? by tepples · 2007-07-31 02:54 · Score: 1

> What you're seeing is a result of the way DKP handles embedded calls. Mute and I had this out in channel a few months ago; you can push a GBA binary using streams down to about 51k using newlib

How would I go about this? Do you still have the IRC log so that I can read through it and try it myself?

> and down to about 6.1k using GHOC.

And what is GHOC? Google ghoc c++ turns up a whole bunch of references that either aren't in English (mostly Vietnamese) or aren't about programming (hedge funds, croquet).

Re:Reducing footprint of static libstdc++ in GCC? by stonecypher · 2007-07-31 06:51 · Score: 1

> How would I go about this?

RTFM.

> And what is GHOC?

The Green Hills Optimizing Compiler. Sorry: I expected its google rank to be higher than it apparently is.

--
StoneCypher is Full of BS

Where is the Alpha? by Anonymous Coward · 2007-07-31 05:54 · Score: 0

I am the Alpha Troll. Vengeance is mine, sayeth the Alpha Troll; I will pre-pay!

Re:C++ I get by Anonymous Coward · 2007-07-31 11:25 · Score: 0

C is C. Assembler is not C. C is not Assembler.

Which manual? by tepples · 2007-07-31 11:29 · Score: 1

> How would I go about this?
> RTFM.

Which section of which manual details how to reduce the binary footprint of <iostream> in static libstdc++? I read the libstdc++ FAQ, but it just says that you can replace g++ with gcc -lsupc++ if you need new and delete but not <iostream>. It states that splitting up parts of libstdc++ into individual sections is currently not implemented due to implementation defects in GNU ld's garbage collection; is there a good way to work around this?

Or which query in which Web search engine should I use? I tried iostream binary footprint in Google and iostream code size in Google, but I didn't see anything relevant in the first pages.

> And what is GHOC?
> The Green Hills Optimizing Compiler.

Thank you. But why are Green Hills products sold on a "call for price" basis?

Re:Which manual? by stonecypher · 2007-08-02 01:20 · Score: 1

> Or which query in which Web search engine should I use?

Stop asking me idiotic questions, and spend more than three minutes researching. You have this tremendously ugly habit of insisting something that's relatively simple isn't true, then nagging people over and over after they've already made it clear that they're not going to look it up for you to please look it up for you.

I told you RTFM last time. That is not an invitation for you to ask me what chapter, what query, what page, any of it. I figured out how to drop the size that far in under 15 minutes. Do something without help, for once in your life, you lazy sack.

> Thank you. But why are Green Hills products sold on a "call for price" basis?

What am I, their sales team? Call them and fucking ask. Jesus.

--
StoneCypher is Full of BS

Re:C++ I get by Anonymous Coward · 2007-07-31 21:49 · Score: 0

Your professor sounds exactly like a student.
