The Sacrifices of Portability?
hackwrench asks: "There is lots of talk about writing portable programs, but this pursuit has resulted in a lot of processor features going unused. One example is being able to write a program that purposely uses a combination of 16-bit and 32-bit. I know there are arguments that writing solely in one or the other is a performance advantage, but what are the factors involved? Is the slowness of such a combination inherent in its design or is it a result of current hardware? We are beginning to replace systems and programs designed primarily to run in pure 32-bit mode with systems designed to run in pure 64-bit mode, so I ask: Is such purity really worth it?"
For instance, consider a video game. The faster it is, the more likely it is that players will like it. But other factors matter far more, starting with whether the game is just plain fun. So in video games there is really a basic threshold of speed that needs to be met, and once it is met, other factors are more important.
Next consider a real-time system for trading stocks. This system is all about speed and reliability. You can control the deployment hardware, and it is economically worthwhile to spend a lot on development if it makes more money in the long run. So coding your own memory pooler that uses the size of the pointer and a specific struct to make the code allocate and deallocate memory in constant time (it is very possible) is worthwhile because it can save a lot of time per transaction.
But all of these issues come down to what exactly you are writing and both the technical and business requirements of your project. Without knowing those in advance, we can't really answer your question.
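For what it's worth, here is a minimal sketch of the kind of constant-time pooler described above, assuming a fixed number of same-sized blocks. The names (`order`, `pool`, `pool_alloc`, `pool_free`) and the block count are made up for illustration; a real trading system would need to think about threading, alignment and exhaustion policy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size block pool; all names and sizes are illustrative. */
#define POOL_BLOCKS 64

typedef struct order {          /* stand-in for the "specific struct" */
    long id;
    double price;
} order;

/* A slot is either a live object or a link in the free list. */
typedef union slot {
    union slot *next;
    order payload;
} slot;

typedef struct pool {
    slot storage[POOL_BLOCKS];
    slot *free_list;
} pool;

/* Thread every slot onto the free list: O(n), paid once at start-up. */
void pool_init(pool *p) {
    for (size_t i = 0; i + 1 < POOL_BLOCKS; i++)
        p->storage[i].next = &p->storage[i + 1];
    p->storage[POOL_BLOCKS - 1].next = NULL;
    p->free_list = p->storage;
}

/* O(1): pop the head of the free list; returns NULL when the pool is empty. */
order *pool_alloc(pool *p) {
    slot *s = p->free_list;
    if (s == NULL)
        return NULL;
    p->free_list = s->next;
    return &s->payload;
}

/* O(1): push the block back onto the free list. */
void pool_free(pool *p, order *o) {
    slot *s = (slot *)o;        /* the payload lives at the start of its slot */
    assert(s >= p->storage && s < p->storage + POOL_BLOCKS);
    s->next = p->free_list;
    p->free_list = s;
}
```

Allocation and deallocation are each a couple of pointer moves with no searching, which is where the constant-time claim comes from. The cost is that the pool only serves one object size and has a hard capacity.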
"Those that start by burning books, will end by burning men."
It used to be that computers were expensive and people were relatively cheap. Nowadays, the reverse is generally true.
So, unless these systems have performance-critical portions, like high-speed digital signal processing where every FLOP counts, it really isn't worth the extra effort to optimize your code for the platform - you'll just end up having to hand-tweak (or even worse, un-tweak) it again on the next hardware upgrade.
When information is power, privacy is freedom.
This is the compiler's job. If your compiler targets a particular processor poorly, get a better compiler.
There is no such thing as portable code:
When most developers talk about portability they are talking about OS portability. The portable-to-other-processors debate has long since left the building, largely due to incredible speed increases in processors. There's no reason, apart from esoteric algorithm tweaking, to code something in a processor-specific manner.
Code porting to another OS is only an issue because operating systems and the hardware they run on are still changing at a dramatic pace. There is no standardized language that covers all the common aspects of a modern operating system, because they are aiming at a moving target. Even the ultra-portable Java has to be extended outside of the official specification to cover serial ports, complex sound, complex graphics, etc.
Portability hasn't been about processor speed for a very long time, and at this point it shouldn't be - a better compiler or a faster processor is a *ton* cheaper (time, money) than writing processor-specific code in all but a few extraordinary cases.
-Adam
-David
There. Now go play some cool javascript games!
A couple of points about optimization.
1) Premature optimization is evil. Everybody says this, but so many people do not take it to heart. I'd rather have software that works than software that is fast but crashes. As a programmer, it's nice to work on non-buggy software, even if it's not as fast as it could be.
2) Target-specific optimization is generally evil, unless you're sure your code will not live very long (e.g., a game). The thing is that micro-optimizations generally tune for a particular processor, and actually pessimize the code in the long run. In comparison, if you write good general code, it'll still be fast ten years from now when processors look very different.
3) The bottlenecks that people, especially C/C++ programmers, worry about are usually not the bottlenecks that matter. If you worry that your code could be faster or more memory-efficient if you use a 16-bit field here or there instead of a 32-bit one, your algorithms had better be absolutely perfect. Most code does not use perfect algorithms. That's why so much software is still so slow. Most programmers just don't get the time to use the best algorithms, much less get down to the level of micro-optimizations.
That's why I always find language performance debates entertaining. C/C++ programmers will freak out if you tell them language X is very productive, but is maybe two-thirds as fast as C (something that is true of a number of high-level, but compiled, languages). Meanwhile, they will write code that runs at maybe 1/3 of what the machine is capable of, because they spend so much time writing the code they have little time to optimize it.
A deep unwavering belief is a sure sign you're missing something...
Generally, the time you spend adding useless annotations to your source code would be better spent with a pencil and paper trying to figure out a way to improve your algorithm. Compilers, generally, are good enough these days, especially now that GCC is decent and runs on most of the interesting processors. The gains in performance, and this is something that even the Linux kernel guys have realized, are going to come from good algorithms. This is especially true because of the recent multi-core phenomenon. More and more, "good code" is going to be code that implements good scalable algorithms. Lower complexity beats smaller constant factors any day of the week.
First, using 16-bit components of registers incurs a stall on most modern x86 CPUs. Remember, they are RISC processors underneath, which have no conception of partial GPRs. Second, RAM is dirt cheap, so let's not even consider blowing RAM. Things get interesting when talking about fitting things in cache, but the simple truth is that if your data doesn't fit in cache, the benefit from just halving its size is usually minimal. As soon as your data set grows, you've blown the cache again. You're almost always better served trying to figure out how to get your code to operate on data in cache-sized chunks, so your performance stays consistently good, instead of being great with one data set and piss-poor with another.
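To make "operate on data in cache-sized chunks" concrete, here is a rough sketch using a tiled matrix transpose, which is a common textbook illustration rather than anything from this thread. The function name `transpose_blocked` and the `TILE` value of 32 are assumptions; the right tile size depends on the cache of the machine you actually run on.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed tile edge; tune it so a source tile and a destination tile
 * both fit in the cache level you care about. */
#define TILE 32

/* Transpose an n x n row-major matrix one tile at a time, so the strided
 * writes stay inside a small region instead of sweeping the whole array. */
void transpose_blocked(const float *src, float *dst, size_t n) {
    assert(src != dst);         /* in-place transpose is not handled here */
    for (size_t bi = 0; bi < n; bi += TILE) {
        for (size_t bj = 0; bj < n; bj += TILE) {
            /* Clamp the tile at the matrix edge when n is not a multiple of TILE. */
            size_t imax = bi + TILE < n ? bi + TILE : n;
            size_t jmax = bj + TILE < n ? bj + TILE : n;
            for (size_t i = bi; i < imax; i++)
                for (size_t j = bj; j < jmax; j++)
                    dst[j * n + i] = src[i * n + j];
        }
    }
}
```

The result is identical to the naive two-loop transpose; only the order of memory accesses changes. The working set per tile stays the same size no matter how large n gets, which is the point being made about performance staying steady across data sets.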
I don't know. Then again, ten years ago, if you'd told me that an e-mail client or web browser would require tens of megabytes of memory just to load, or it would require over 100MB just to store the quick start-up code for an office application, I'd have laughed. Right now, that's exactly what Firefox, Thunderbird and OpenOffice 2.0 are claiming on the PC where I'm writing this.
Actually, I'm still laughing, because that says more than words about the design of those applications and the tools used to compile them. But the applications have expanded to fill the space nevertheless.
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
Consider, for example, passing a large bit of data as a parameter to a function. In languages that use pass-by-reference semantics, this will typically be cheap. In languages that use pass-by-value semantics, this will typically be expensive. In C++, you have a choice, but the natural (that is, default) choice is by value. Would you tell a C++ programmer not to use const-reference parameter types from the start, because it's a premature optimization?
I would tell a C++ programmer that worrying about a bit of extra data copy in the function call is generally useless. It's really not that expensive unless your structs are monstrously large. Generally the question you're interested in is semantics. Do the semantics lend themselves to a pass-by-value, or a pass-by-reference? If, after profiling, you find that this is a problem, use the passing style on those few functions that the profiler points out. Doing it for everything else is useless.
In some types of software, you simply have to plan for performance from the start.
Yes, but planning for performance from the start doesn't mean optimizing from the start. It means designing good algorithms and implementing them without any grossly stupid performance mistakes. Optimization can happen after implementation, where profiling shows the need for more hand-tuning.
Obviously algorithmic improvements make more difference than anything else, but even so, there's a scale between large-scale algorithm and data structure changes and assembly-level micro-optimisation, not a switch.
It's a scale, but one very biased towards high-level optimizations. Compilers do an excellent job of the low-level stuff. Even at the data structure level, you get a lot more benefit from considering things like ordering your access patterns for cache-friendliness than you do from saving a byte or two here or there.
I disagree here. Read the page you linked to again. The point is that you have to have a feel for the overall design of the program you are making and how that design will work in the end. It is not about how fast you can make memcpy() go (for example)--that can only get you so far. Take for example:
That is because in the context of C (which the discussion is about) the lengths of strings are not known (quickly). For large numbers of strings your algorithm is still orders of magnitude slower than keeping the strings sorted and doing a binary search. That was my real point, and Knuth's point: optimizing your overall algorithm can yield vast improvements that hand-optimizing little sections of code just cannot come close to. This is what the linked essay says good programmers develop a feel for, not silly little tricks to speed up a single for loop. That's the kind of thing you do very last, and only if you have an intensely speed-critical application and you've already exhausted optimizing your algorithms, because you're only going to speed things up by small percentages.
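A quick sketch of the "keep the strings sorted and do a binary search" approach, using the standard `qsort` and `bsearch`. This is not the example posted earlier in the thread; the helper names `sort_names` and `find_name` are invented here for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparator for arrays of `const char *`: qsort and bsearch hand us
 * pointers to the array elements, so dereference once before strcmp. */
static int cmp_str(const void *a, const void *b) {
    const char *const *sa = a;
    const char *const *sb = b;
    return strcmp(*sa, *sb);
}

/* Sort once up front: O(n log n) comparisons. */
void sort_names(const char **names, size_t n) {
    qsort(names, n, sizeof *names, cmp_str);
}

/* Each lookup is O(log n) comparisons instead of a linear scan.
 * Returns the index of key in the sorted array, or -1 if absent. */
long find_name(const char *const *sorted, size_t n, const char *key) {
    assert(sorted != NULL && key != NULL);
    const char *const *hit = bsearch(&key, sorted, n, sizeof *sorted, cmp_str);
    return hit ? (long)(hit - sorted) : -1;
}
```

With a million strings, a linear scan does up to a million `strcmp` calls per lookup while the binary search does about twenty. No amount of tuning the inner comparison makes up that gap.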
If the reason you are talking about is some semblance of portability then you are right. Have you ever read the C rationale? It explains the reasoning of the decisions the C committee made and helps you see things from their point of view. It was very enlightening for me when I first read it. It apparently used to be part of the C standard but they broke it off into a separate document at some point.
If you have to test it on another architecture, you are not writing portable code. Portability is less about specific architectures and more about "don't assume anything that might not be true on other architectures", like endianness, sizeof(int), and so on.
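As a small illustration of "don't assume anything", here is a sketch that writes and reads a 32-bit value in a fixed byte order without depending on the host's endianness or on sizeof(int). The function names `put_u32_be` and `get_u32_be` are made up for this example; the fixed-width types come from the standard `<stdint.h>`.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Store v as four big-endian bytes. Shifts operate on the value, not on
 * its in-memory layout, so this behaves the same on any host byte order. */
void put_u32_be(uint8_t out[4], uint32_t v) {
    assert(out != NULL);
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Rebuild the value from four big-endian bytes. Each byte is widened to
 * uint32_t before shifting so the result does not depend on the width of int. */
uint32_t get_u32_be(const uint8_t in[4]) {
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

The non-portable alternative is to `memcpy` or cast a plain `int` straight into the buffer, which silently bakes in both the host's byte order and its idea of how big an int is.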
Linux is not Windows
I understand the fact that you can at least prepare for portability. However, I would always want to run it through an alpha, beta (and maybe acceptance) environment before saying it'll work.
8 of 13 people found this answer helpful. Did you?