Slashdot Mirror


The Sacrifices of Portability?

hackwrench asks: "There is lots of talk about writing portable programs, but this pursuit has resulted in a lot of processor features going unused. One example is being able to write a program that purposely uses a combination of 16-bit and 32-bit code. I know there are arguments that writing solely in one or the other is a performance advantage, but what are the factors involved? Is the slowness of such a combination inherent in its design, or is it a result of current hardware? We are beginning to replace systems and programs designed primarily to run in pure 32-bit mode with systems designed to run in pure 64-bit mode, so I ask: Is such purity really worth it?"

18 of 95 comments

  1. Compiler Optimizations by jaredmauch · · Score: 2, Informative

    Is the problem that the compiler optimizations are not producing the right output? Or is too much of the code compiled with debug flags (i.e., -g)? I would expect the compiler to handle things, but I've found that I rarely want to run non-debug code: things rarely go south, but when they do, I'd rather have an easy way to solve the problem. There are some cases where performance matters and I don't do this, but that's rare in my experience. People have done many studies of which compiler optimizes better, e.g., gcc vs. the Intel compiler or gcc vs. the Sun compiler. Generally the one written by the vendor does a slightly better job.

  2. The industry by Device666 · · Score: 2, Interesting

    The hardware/software industry (generally speaking) doesn't care about quality as long as companies are busy competing with each other at a high pace. Because companies are competing, they will seek features others don't have, and most of the time the relevance of those features is very small, especially if the company has become very big (there are always exceptions, of course). The crowd isn't very picky either, though it is clear that open source has added value to the development of portable software. People buy an amd64 (not only because of the price; of course some geeks also want the 64-bit feature), but the majority still run 32-bit Windows binaries on it. So most people don't care that much, I would say. They just buy something because it's cheap, to play games on and to do basic things like Patience, FreeCell, Word, etc. There is so much technology to know about that at some point people don't care anymore. The next day there will always be newer and faster hardware, even when you only wanted to make a simple document. But then you have to use Vista (for some), so you would need a faster computer. Portability is only useful for people who don't want to keep buying software and are fed up with it. A very few of them get their hands dirty and migrate to open source/free software or start to write alternatives themselves. To those people these things really matter. They want to make something durable, and it simply takes time for software to mature. Portability of code really matters when there are more open source/free software users and developers. Then people will experience the benefits of portable code.

    1. Re:The industry by __david__ · · Score: 2, Insightful
      Portability is only useful for people who don't want to keep buying software and are fed up with it.
      No, portability is more useful to those writing software that has to run in 2 (or more) environments. Say I want to write a game that runs on the xbox and the ps2. The more portable I make my code, the happier I will be in the long run (and the cheaper the price will be for the port to whichever platform comes second).

      -David
  3. 16 bit is often slower than 32 bit by dtfinch · · Score: 4, Informative

    In 32 bit protected mode, 16 bit instructions require a prefix to tell it that the following instruction is 16 bit, wasting a byte and a CPU cycle. In 16 bit real mode, the same is true of 32 bit instructions. But modern processors aren't optimized to preserve 16 bit performance. If they can improve 32 bit performance just a little, they'd be willing to sacrifice a lot of 16 bit performance to do it. Also, if you're mixing 16 and 32 bit variables in C/C++, it'll do a lot of expensive conversions to make it all work. I've done very little with 64 bit though, aside from playing with MMX on one occasion.
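    The conversions mentioned above can be sketched in C. This is only an illustration, assuming a platform where int is 32 bits wide; the function names sum_mixed, add16 and add16_wide are made up for the example.

```c
#include <stdint.h>

/* Hypothetical helper: mixes 16-bit and 32-bit operands.
 * Each int16_t element is promoted (sign-extended) to int before the add. */
int32_t sum_mixed(const int16_t *small, const int32_t *wide, int n)
{
    int32_t total = 0;
    for (int i = 0; i < n; i++) {
        total += small[i];   /* implicit 16 -> 32 bit sign extension */
        total += wide[i];    /* already 32 bits, no conversion needed */
    }
    return total;
}

/* The addition itself is done in int; the cast then truncates the
 * result back to 16 bits, so 65535 + 1 wraps around to 0. */
uint16_t add16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}

/* Keeping the result wide avoids the narrowing step and the wraparound. */
uint32_t add16_wide(uint16_t a, uint16_t b)
{
    return (uint32_t)a + b;
}
```

    Whether the extra widening and narrowing steps cost anything measurable depends on the compiler and the processor; the sketch only shows where they occur.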

  4. Ideally, your code is clean enough by Frumious+Wombat · · Score: 4, Interesting

    that this transition isn't all that painful.

    My personal experience with this was Linux on Alpha, where certain programs assumed a 32-bit environment rather than querying the system they were built on for the size of int, pointer, etc. As a result many programs were funky on the Alpha, and the 'pc-isms' (what we once would have called Vaxocentrisms) caused a great waste of time, as they had to be tracked down and eliminated.

    Your code, if you've been worrying about anything other than 32-bit PCs, should already be 64-bit clean, as you've had 15 years of Alpha, SGI, Power, Itanium, and Sun 64-bit systems to support. If it isn't, hopefully it's something such as user interface which will still run in the 32-bit environment, though not necessarily optimally.

    Personally, I think that writing robust, portable, code is worth the effort. Unless you're talking about running on an embedded system where every byte counts, it doesn't hurt you at all to design clean algorithms and data structures, and put in checks to actually determine the size of ints, longs, pointers, etc, rather than just assuming that everyone will run x86 (or MIPS-64 or whatever) from now until the end of time. I have research programs that were written in the 70s (in their original form), on Cyber 205 and similar long-gone architectures, which still work because they were written in a mostly portable manner, with only the most critical nasty bits tied specifically to that machine. Your code is going to be in use longer than you think; be nice to your successors and make it portable now.

    --
    the more accurate the calculations became, the more the concepts tended to vanish into thin air. R. S. Mulliken
    1. Re:Ideally, your code is clean enough by Taladar · · Score: 3, Insightful

      If you have to test it on another architecture, you are not writing portable code. Portability is less about specific architectures and more about "don't assume anything that might not be true on other architectures", like endianness, sizeof(int), and so on.
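      One way to avoid assuming endianness or sizeof(int) is sketched below in C. It is a minimal illustration, not code from any of the programs discussed; the names put_u32_be and get_u32_be are hypothetical.

```c
#include <stdint.h>

/* Store a 32-bit value as four bytes, most significant byte first,
 * regardless of the host's byte order or the size of int. */
void put_u32_be(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Rebuild the value from the same byte layout using shifts only,
 * never by casting the buffer to an integer pointer. */
uint32_t get_u32_be(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

      Because the byte order is spelled out with shifts and the width comes from uint32_t, the same source should behave identically on 32-bit and 64-bit hosts of either endianness.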

  5. Detailed Response to Cliff and HackWrench by woolio · · Score: 2, Interesting

    Does "hackwrench" even know how to program? Does he know anything about Computer Architecture? "Hennessy" or "Patterson" ring a bell? Sounds like "Cliff" likes to feed trolls. Maybe "hackwrench" will choke while digesting this one:

    What is the inherent "slowness" of "16 bit code"? WTF is "16 bit code" anyway? Sounds like he has been duped by the marketing droids...

    So-called "32-bit" processors are typically designed to perform (up to) 32-bit arithmetic efficiently. For integer operations, 8-bit, 16-bit and 32-bit arithmetic usually each take the same amount of time (8-bit add = 16-bit add != 16-bit multiply).

    Because "32-bit" processors can do "32-bit" arithmetic efficiently, it makes sense for them to use (up to) 32 bits for addressing. Arithmetic involving addresses comes up more often than you would think... (Branch/jump instructions, memory operations, and even the basic updating of the program counter.) Since these processors' data paths are (typically) 32 bits wide, instructions are typically encoded using up to 32 bits. (In a 32-bit RISC processor, most of the instruction bits are reserved to allow large immediate operands for memory offsets, jump targets, and arithmetic/logic operations.)

    The only thing a "32 bit" processor typically isn't good for is "64 bit" arithmetic (and any arithmetic over 32 bits, for that matter). Which means on these, a "64 bit" addition could be performed using 3 "32-bit" additions and a branch. "64-bit" multiplications get even worse...
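    The 64-bit addition described above can be sketched in C using only 32-bit operations. This is an illustration under assumptions: the type u64_pair and the function add64 are hypothetical, and the carry is computed with a comparison rather than a branch.

```c
#include <stdint.h>

/* Hypothetical 64-bit value held as two 32-bit halves. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_pair;

u64_pair add64(u64_pair a, u64_pair b)
{
    u64_pair r;
    r.lo = a.lo + b.lo;              /* first 32-bit add, wraps mod 2^32 */
    uint32_t carry = (r.lo < a.lo);  /* it wrapped iff the sum is smaller than an operand */
    r.hi = a.hi + b.hi + carry;      /* second and third 32-bit adds */
    return r;
}
```

    Many instruction sets also offer an add-with-carry instruction (ADC on x86, for example) that folds the carry into the second addition, so the cost on real hardware can be lower than this sketch suggests.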

    But if a program doesn't access much memory ( packed arithmetic whereby it can treat a 32-bit integer as a pair of 16-bit integers and a single operation can calculate both results... But this alone is hardly justification for using such a processor.

    So guess what, folks: There will likely never be a "1024-bit" processor (at least not for general-purpose computing). I'm not trying to sound like Bill Gates with his "640k is enough" quote, but I don't see why processors will ever use much more than 64- or 128-bit addressing. (Keep in mind that EACH BIT *doubles* the range of integer numbers/addresses the processor can handle efficiently.)

    Yes, we now can have 2^32 bytes of memory in computers (4GB). But WTF is anyone going to do with 2^64 bytes of RAM? That's probably many orders of magnitude greater than the total capacity of all electronic devices ever produced from the 1950s until now...

    In conclusion, WTF? Mod Editor Down!

    1. Re:Detailed Response to Cliff and HackWrench by Anonymous+Brave+Guy · · Score: 3, Insightful
      Yes we now can have 2^32 bytes of memory in computers (4GB). But WTF is anyone going to do with 2^64 bytes of ram?

      I don't know. Then again, ten years ago, if you'd told me that an e-mail client or web browser would require tens of megabytes of memory just to load, or it would require over 100MB just to store the quick start-up code for an office application, I'd have laughed. Right now, that's exactly what Firefox, Thunderbird and OpenOffice 2.0 are claiming on the PC where I'm writing this.

      Actually, I'm still laughing, because that says more than words about the design of those applications and the tools used to compile them. But the applications have expanded to fill the space nevertheless.

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
  6. It depends by sfcat · · Score: 4, Insightful
    There are many factors that go into deciding how to write code. Portability is just one consideration of many. I would say that it is worth it if speed is of critical importance and development expenses are of no consequence.

    For instance, consider a video game. The faster it is the more likely it is that players will like it. But there are many more important factors including is the game just plain fun. So in video games, there is really a basic threshold of speed that needs to be met and after that is met, other factors are more important.

    Next consider a real-time system for trading stocks. This system is all about speed and reliability. You can control the deployment hardware, and it is economically worthwhile to spend a lot on development if it makes more money in the long run. So coding your own memory pooler that uses the size of the pointer and a specific struct to make the code allocate and deallocate memory in constant time (it is very possible) is worthwhile because it can save a lot of time per transaction.
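    A constant-time pool of the kind described can be sketched in C as a free list threaded through fixed-size blocks. This is a minimal sketch, not the poster's actual code: the names pool, pool_block, pool_init, pool_alloc and pool_free are hypothetical, and the 64-byte payload size is an assumption.

```c
#include <stddef.h>

/* Hypothetical fixed-size block. While a block is free, its storage is
 * reused to hold the pointer to the next free block. */
typedef union pool_block {
    union pool_block *next;     /* valid only while the block is free */
    unsigned char payload[64];  /* assumed object size */
} pool_block;

typedef struct {
    pool_block *free_list;
} pool;

/* Thread every block of caller-provided storage onto the free list. */
void pool_init(pool *p, pool_block *storage, size_t count)
{
    p->free_list = NULL;
    for (size_t i = 0; i < count; i++) {
        storage[i].next = p->free_list;
        p->free_list = &storage[i];
    }
}

/* O(1): pop the head of the free list; returns NULL when exhausted. */
void *pool_alloc(pool *p)
{
    pool_block *b = p->free_list;
    if (b != NULL)
        p->free_list = b->next;
    return b;
}

/* O(1): push the block back onto the free list. */
void pool_free(pool *p, void *ptr)
{
    pool_block *b = ptr;
    b->next = p->free_list;
    p->free_list = b;
}
```

    Allocation and deallocation each touch one or two pointers regardless of how many blocks exist, which is where the constant-time behavior comes from. The trade-off is that every object must fit the one fixed block size.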

    But all of these issues come down to what exactly you are writing and both the technical and business requirements of your project. Without knowing those in advance, we can't really answer your question.

    --
    "Those that start by burning books, will end by burning men."
    1. Re:It depends by forkazoo · · Score: 4, Interesting

      A lot of people are writing responses that tend to assume it is impossible to write code that is portable, and also optimised for a specific platform. I recently read a book called "Vector Game Math Processors" (everybody needs a hobby, right?). Looking at how the examples were coded in that book sort of shifted my assumptions about how I should do things.

      Basically, the book covers the major vector instruction sets: Altivec, PS2, SSE, etc. Naturally, a program written with hand optimised SSE assembly won't run very well on a PowerMac G4. So, the approach the author used was to start by coding a vector math function in plain C. He only calls this function by a function pointer. So, instead of calling sw_vector_foo directly, he calls vector_foo. He then goes on to write altivec_foo, and sse_foo, and gamecube_foo. With some simple #ifdefs at compile time, the function pointer is assigned to the most optimal code path for the platform.

      So, the result is that by thinking about portability going in, he has to do hardly any work to have fairly optimal hand-tuned vector routines for a new architecture.
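      The function-pointer approach described above might look roughly like this in C. It is a sketch under assumptions rather than the book's code: the operation (element-wise add) and the names sw_vector_add, sse_vector_add and vector_add are made up, and it relies on the __SSE__ macro that GCC and Clang define when SSE is enabled.

```c
#include <stddef.h>

/* Portable reference version in plain C. */
void sw_vector_add(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

#if defined(__SSE__)
#include <xmmintrin.h>

/* SSE version: four floats per iteration, scalar loop for the tail. */
void sse_vector_add(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        _mm_storeu_ps(dst + i,
                      _mm_add_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i)));
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
#endif

typedef void (*vector_add_fn)(float *, const float *, const float *, size_t);

/* Callers only ever use vector_add; the #if picks the code path at
 * compile time. */
#if defined(__SSE__)
vector_add_fn vector_add = sse_vector_add;
#else
vector_add_fn vector_add = sw_vector_add;
#endif
```

      Adding another architecture would mean writing one more variant and one more #elif branch, while every call site keeps using vector_add unchanged.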

      In general, code written to be portable is also much cleaner, and better commented, and whatnot, just because the author was forced to spend an extra few minutes thinking about how things ought to be put together. I really can't think of any normal case where portability shouldn't be a consideration. On some obscure embedded systems, you might really want to optimise to a super specific piece of hardware, but it is seldom worth it.

      Think about writing GUI apps for a Palm Pilot before the switch to ARM CPUs. A programmer could have said, "hey, I'm using the Palm OS APIs, and they only run on Coldfire CPUs, so I have no reason to make anything portable." Then, a little while later, Palm OS starts running on ARM. If he had invested a smidgen of extra effort to write his code in a portable way, he could easily start to take advantage of the ARM stuff right away. Since most of the issues of portability are in the planning phase, and get handled at compile time, the memory footprint need not be appreciably larger. (Like a bunch of hand-coded ASM for a different platform, which gets #ifdef'd away, or sizeof() operators...)

  7. Does it matter? by Jah-Wren+Ryel · · Score: 4, Insightful

    It used to be that computers were expensive and people were relatively cheap. Nowadays, the reverse is generally true.

    So, unless these systems have performance critical portions, like high-speed digital signal processing where every FLOP counts, it really isn't worth the extra effort to optimize your code for the platform - you'll just end up having to hand-tweak (or even worse, un-tweak) it again on the next hardware upgrade.

    --
    When information is power, privacy is freedom.
    1. Re:Does it matter? by richg74 · · Score: 5, Insightful
      It used to be that computers were expensive and people were relatively cheap. Nowadays, the reverse is generally true.

      For most applications, the potential performance gains from hand optimization for a specific platform aren't enough to matter. (And, as I think Brian Kernighan said, trying to outsmart the compiler defeats the purpose of using one.) Big performance gains come, in most cases, from figuring out a better way (~algorithm) to solve the problem, not from tweaks.

      There's another aspect of portability that doesn't get mentioned too much: the portability of the programmer. If you are in the habit of writing portable code, it's much easier to shift to working on a different platform. (I'd also say, from my own experience, that it makes your work less error-prone.) That versatility is potentially of significant value to your employer, and of course is of value to you personally.

  8. This is the compiler's job. by stienman · · Score: 3, Insightful
    There is lots of talk about writing portable programs, but this pursuit has resulted in a lot of processor features going unused.

    This is the compiler's job. If your compiler targets a particular processor poorly, get a better compiler.

    There is no such thing as portable code:
    • There is code that is written according to the language specification (ANSI C, Java, etc.), which is what one normally considers "portable" only because standards-compliant compilers exist for several platforms.
    • There is code that uses processor/platform/OS/compiler specific extensions, which is normally considered unportable because libraries don't exist for all platforms.

    When most developers talk about portability they are talking about OS portability. The portable-to-other-processors debate has long since left the building largely due to incredible speed increases in processors. There's no reason, apart from esoteric algorithm tweaking, to code something in a processor specific manner.

    Code porting to another OS is only an issue because operating systems and the hardware they run on are still changing at a dramatic pace. There is no standardized language that covers all the common aspects of a modern operating system, because they are aiming at a moving target. Even the ultra-portable Java has to be extended outside of the official specification to cover serial ports, complex sound, complex graphics, etc.

    Portability hasn't been about processor speed for a very long time, and at this point it shouldn't be - a better compiler or a faster processor is a *ton* cheaper (time, money) than writing processor specific code in all but a few extraordinary cases.

    -Adam
  9. The performance question by be-fan · · Score: 4, Insightful

    A couple of points about optimization.

    1) Premature optimization is evil. Everybody says this, but so many people do not take it to heart. I'd rather have software that works than software that is fast but crashes. As a programmer, it's nice to work on non-buggy software, even if it's not as fast as it could be.

    2) Target-specific optimization is generally evil, unless you're sure your code will not live very long (e.g., a game). The thing is that micro-optimizations generally tune for a particular processor and actually pessimize the code in the long run. In comparison, if you write good general code, it'll still be fast ten years from now when processors look very different.

    3) The bottlenecks that people, especially C/C++ programmers, worry about are usually not the bottlenecks that matter. If you worry that your code could be faster or more memory-efficient if you used a 16-bit field here or there instead of a 32-bit one, your algorithms had better be absolutely perfect. Most code does not use perfect algorithms. That's why so much software is still so slow. Most programmers just don't get the time to use the best algorithms, much less get down to the level of micro-optimizations.

    That's why I always find language performance debates entertaining. C/C++ programmers will freak out if you tell them language X is very productive, but is maybe two-thirds as fast as C (something that is true of a number of high-level, but compiled, languages). Meanwhile, they will write code that runs at maybe 1/3 of what the machine is capable of, because they spend so much time writing the code they have little time to optimize it.

    --
    A deep unwavering belief is a sure sign you're missing something...
  10. Re:I don't really want my compiler to be very smar by be-fan · · Score: 2, Insightful

    Generally, the time you spend adding useless annotations to your source code would be better spent with a pencil and paper trying to figure out a way to improve your algorithm. Compilers, generally, are good enough these days, especially now that GCC is decent and runs on most of the interesting processors. The gains in performance, and this is something that even the Linux kernel guys have realized, are going to come from good algorithms. This is especially true because of the recent multi-core phenomenon. More and more, "good code" is going to be code that implements good scalable algorithms. Lower complexity beats smaller constant factors any day of the week.

    --
    A deep unwavering belief is a sure sign you're missing something...
  11. Re:Think memory usage, not size... by be-fan · · Score: 2, Insightful

    First, using 16-bit components of registers incurs a stall on most modern x86 CPUs. Remember, they are RISC processors underneath, which have no conception of partial GPRs. Second, RAM is dirt cheap, so let's not even consider blowing RAM. Things get interesting when talking about fitting things in cache, but the simple truth is that if your data doesn't fit in cache, the benefit from just halving its size is usually minimal. As soon as your data set grows, you've blown the cache again. You're almost always better-served trying to figure out how to get your code to operate on data in cache-sized chunks, so your performance stays constantly good, instead of being great with one data set and piss-poor with another.
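    Operating on data in cache-sized chunks can be sketched in C as follows. This is only an illustration: the function scale_and_sum is hypothetical, the CHUNK size is an assumed value that would need tuning for a real cache, and these two simple passes could also just be fused into one loop.

```c
#include <stddef.h>

#define CHUNK 4096  /* assumed: number of elements that fits comfortably in cache */

/* Two passes over the data (scale, then sum), run chunk by chunk so the
 * second pass touches each chunk while it is still likely to be cached. */
double scale_and_sum(float *data, size_t n, float factor)
{
    double total = 0.0;
    for (size_t start = 0; start < n; start += CHUNK) {
        size_t end = (n - start < CHUNK) ? n : start + CHUNK;
        for (size_t i = start; i < end; i++)   /* pass 1: scale in place */
            data[i] *= factor;
        for (size_t i = start; i < end; i++)   /* pass 2: accumulate */
            total += data[i];
    }
    return total;
}
```

    The working set per iteration stays the same size however large n grows, which is the property the comment argues for: performance that does not fall off a cliff when the data set outgrows the cache.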

    --
    A deep unwavering belief is a sure sign you're missing something...
  12. Re:Sure, but premature pessimization is evil, too by be-fan · · Score: 2, Insightful

    Consider, for example, passing a large bit of data as a parameter to a function. In languages that use pass-by-reference semantics, this will typically be cheap. In languages that use pass-by-value semantics, this will typically be expensive. In C++, you have a choice, but the natural (that is, default) is by value. Would you tell a C++ programmer not to use const-reference parameter types from the start, because it's a premature optimization?

    I would tell a C++ programmer that worrying about a bit of extra data copy in the function call is generally useless. It's really not that expensive unless your structs are monstrously large. Generally the question you're interested in is semantics. Do the semantics lend themselves to a pass-by-value, or a pass-by-reference? If, after profiling, you find that this is a problem, use the passing style on those few functions that the profiler points out. Doing it for everything else is useless.

    In some types of software, you simply have to plan for performance from the start.

    Yes, but planning for performance from the start doesn't mean optimizing from the start. It means designing good algorithms and implementing them without any grossly stupid performance mistakes. Optimization can happen after implementation, where profiling shows the need for more hand-tuning.

    Obviously algorithmic improvements make more difference than anything else, but even so, there's a scale between large-scale algorithm and data structure changes and assembly-level micro-optimisation, not a switch.

    It's a scale, but one very biased towards high-level optimizations. Compilers do an excellent job of the low-level stuff. Even at the data structure level, you get a lot more benefit from considering things like ordering your access patterns for cache-friendliness than you do from saving a byte or two here or there.

    --
    A deep unwavering belief is a sure sign you're missing something...
  13. Re:Ever heard of playing just to see what will hap by __david__ · · Score: 2, Insightful
    Yes, and the first hit for "premature optimization is the root of all evil" demonstrates my point exactly. To paraphrase, a good software developer will have developed a feel for where performance issues will cause problems.
    Yes, I totally agree with you and the linked essay on this.
    Making it easy to hand optimize can only help one to develop the feel.
    I disagree here. Read the page you linked to again. The point is that you have to have a feel for the overall design of the program you are making and how that design will work in the end. It is not about how fast you can make memcpy() go (for example)--that can only get you so far. Take for example:
    Because QB always maintained the length of a string, I knew that the fastest way to find an unsorted string was to [search linearly]. Interesting how that doesn't come up as a potential solution for you in your string performance scenario.
    That is because in the context of C (which the discussion is about) the lengths of strings are not known (quickly). For large numbers of strings your algorithm is still orders of magnitude slower than keeping the strings sorted and doing a binary search. That was my real point, and Knuth's point. That optimizing your overall algorithm can yield vast improvements that hand optimizing little sections of code just cannot come close to. This is what the linked essay says that good programmers develop a feel for, not silly little tricks to speed up a single for loop. That's the kind of thing you do very last and only if you have an intensely speed critical application and you've already exhausted optimizing your algorithms--because you're only going to speed things up by small percentages.
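    The two approaches compared above can be sketched in C with the standard library's qsort and bsearch. The helper names cmp_str, find_linear, sort_strings and find_sorted are hypothetical, chosen for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback for arrays of C strings: both arguments point
 * at a "const char *" element. */
static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Linear scan of an unsorted array: O(n) string comparisons per lookup. */
const char *find_linear(const char *const *v, size_t n, const char *key)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(v[i], key) == 0)
            return v[i];
    return NULL;
}

/* Sort once up front... */
void sort_strings(const char **v, size_t n)
{
    qsort(v, n, sizeof *v, cmp_str);
}

/* ...then each lookup is a binary search: O(log n) comparisons. */
const char *find_sorted(const char *const *v, size_t n, const char *key)
{
    const char *const *hit = bsearch(&key, v, n, sizeof *v, cmp_str);
    return hit ? *hit : NULL;
}
```

    For a million strings, a lookup in the sorted array needs about 20 comparisons, against up to a million for the linear scan. That gap is the algorithmic gain the comment contrasts with tuning a single loop.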
    C has a conflict of interest. If it has structures to allow you to write beautiful hand-optimization it loses its reason for existing, so guess what, it doesn't. Ugly hand-optimization is a fault of C, not of hand-optimization.
    If the reason you are talking about is some semblance of portability then you are right. Have you ever read the C rationale? It explains the reasoning of the decisions the C committee made and helps you see things from their point of view. It was very enlightening for me when I first read it. It apparently used to be part of the C standard but they broke it off into a separate document at some point.

    -David