Slashdot Mirror


Porting to 64-bit Linux

An anonymous reader writes "As 64-bit architectures continue to gain popularity, it is becoming more and more important to make sure that your software is ready for the shift. IBM developerWorks takes a look at a few of the most common pitfalls when making sure your applications are 64-bit ready. From the article: 'Major hardware vendors have recently expanded their 64-bit offerings because of the performance, value, and scalability that 64-bit platforms can provide. The constraints of 32-bit systems, particularly the 4GB virtual memory ceiling, have spurred companies to consider migrating to 64-bit platforms. Knowing how to port applications to comply with a 64-bit architecture can help you write portable and efficient code.'"

17 of 120 comments

  1. Re:Just a recompile? by cnettel · · Score: 5, Informative
    Unless you assume:
    1. sizeof(int) == sizeof(void*), or
    2. sizeof(int) == 4
    If your codebase only makes the first OR the second assumption, you can tweak the compiler to like you by defines. If you also assume that sizeof(void*) == 4, you have bigger problems. Note that you can do this in rather innocent ways, like dumping a complete structure on disk, knowing that pointer values will be invalid, but just assuming that the structure will be the same size if you read it back later.
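
To make the structure-dumping trap above concrete, here is a minimal sketch. The struct names and fields are made up for illustration; the idea is only that an in-memory struct holding a pointer changes size between ILP32 and LP64, while a separate on-disk header built from fixed-width types does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In-memory form: the pointer member makes sizeof() depend on the ABI
   (typically 8 bytes on ILP32, 16 on LP64 once padding is counted). */
struct record_in_memory {
    int32_t id;
    char   *name;
};

/* Hypothetical on-disk form: fixed-width members only, no pointers. */
struct record_on_disk {
    int32_t  id;
    uint32_t name_len;   /* the name bytes would follow this header */
};

/* Break the build, not the user's data, if the layout ever drifts. */
static_assert(sizeof(struct record_on_disk) == 8,
              "on-disk header must stay 8 bytes");

size_t in_memory_size(void) { return sizeof(struct record_in_memory); }
size_t on_disk_size(void)   { return sizeof(struct record_on_disk); }
```

Writing `struct record_on_disk` instead of the in-memory struct keeps files readable by both 32-bit and 64-bit builds, as long as byte order is also pinned down.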

    In addition, and this is hellish, a 32-bit MOV is (generally) atomic on x86. You can rely on the high-order word and the low-order word staying together, without race conditions. The memory access semantics are different on x64 and many other platforms. This is not related to 64-bitness per se; you could see it if you ported to multi-threaded 32-bit PPC as well, but it will still surface if you do the transition to AMD64/EM64T/x64. Or rather, it will result in an additional one-in-a-million crash in your source that you'll blame on bad memory chips in the user's machine.
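
One way to stop depending on whatever a plain load or store happens to guarantee on a given CPU is to say what you mean with C11 atomics. This is only a sketch under that assumption (a C11 compiler with `<stdatomic.h>`); `shared_stamp` and the function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared value that is wider than 32 bits. */
static _Atomic uint64_t shared_stamp;

/* Store and load as single indivisible operations: a reader can never
   observe the high half of one write paired with the low half of another. */
void publish_stamp(uint64_t v) { atomic_store(&shared_stamp, v); }
uint64_t read_stamp(void)      { return atomic_load(&shared_stamp); }

/* Atomic read-modify-write; returns the new value. */
uint64_t bump_stamp(void) { return atomic_fetch_add(&shared_stamp, 1) + 1; }

/* Small single-threaded demonstration helper. */
uint64_t publish_then_read(uint64_t v) {
    publish_stamp(v);
    return read_stamp();
}
```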

  2. Re:Just a recompile? by baadger · · Score: 2, Insightful

    The answer is no, RTFA. This is the exact perception that it lays to waste.

  3. no by sentientbrendan · · Score: 5, Insightful

    Generally architecture changes, compiler version changes, break code on large projects. Over a million lines of code, any tiny little difference in the platform that the original developers didn't think to account for will come up *somewhere*. A good example of this is if you are dumping data structures to disk or network and write a size_t variable. Suddenly, you can no longer communicate between 32 bit and 64 bit versions of your software.

    As a general rule, "just a recompile" *never happens* for any architecture and compiler change on a project above a certain size. Compiler writers break compatibility with some little ol' thing they don't think anyone is using, but which everyone is actually using in *every* version, fail to implement uncommon or difficult language features, add non standard features that other compilers don't support. Then application developers do things like not swapping to network byte order and using architecture dependent data types (size_t as in the example). Between different unices, header file contents will change.
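
A rough sketch of the usual fix for the size_t example: never put `size_t` itself on the wire, widen it to a fixed 64-bit field and write the bytes in network (big-endian) order by hand. The function names here are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write v as 8 bytes, most significant byte first (network byte order). */
void put_u64_be(unsigned char out[8], uint64_t v) {
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)(v >> (56 - 8 * i));
}

/* Read 8 big-endian bytes back into a host integer. */
uint64_t get_u64_be(const unsigned char in[8]) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    return v;
}

/* Demonstration helpers: round-trip a length, and peek at one wire byte. */
size_t wire_roundtrip(size_t len) {
    unsigned char buf[8];
    put_u64_be(buf, (uint64_t)len);
    return (size_t)get_u64_be(buf);
}

unsigned char wire_byte(uint64_t v, int i) {
    unsigned char buf[8];
    put_u64_be(buf, v);
    return buf[i];
}
```

A 32-bit reader still has to check that the decoded value fits in its own `size_t` before using it; the encoding only guarantees that both sides agree on what was sent.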

    The fixes are often not that hard (usually trivial) to do between, say, versions of the same compiler, or endian switches... but they are still there and annoy the hell out of people trying to compile old open source software on a new platform, like, say, Mac OS X was a few years ago and x86-64 is now. There are always growing pains.

  4. Re:Just a recompile? by Keeper · · Score: 2, Insightful

    That really depends on the code. x64 changes the size of a standard pointer, but didn't change the size of a word. In the real world (assuming we're talking about an app which was always 32bit only), once you get something to build against x64, you're about 90% done (because coders are human, and people do stupid shit sometimes).

    In my experience, most of the problems will center around using non-pointer types with pointer types. Mostly around bounds checking, offsets into arrays, pointer arithmetic, etc. I've seen some really awful code which actually stores pointer values in non-pointer types ... bleh. That intermittently breaks in wonderfully interesting ways.
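
For the cases described above, C99 already has integer types that are sized for the job. A minimal sketch, with made-up helper names: `uintptr_t` when a pointer really must live in an integer, `ptrdiff_t` for the distance between two elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A pointer survives a round trip through uintptr_t. */
int roundtrip_ok(void) {
    int x = 42;
    uintptr_t bits = (uintptr_t)(void *)&x;
    int *p = (int *)(void *)bits;
    return p == &x && *p == 42;
}

/* True on ABIs where "unsigned int" is too small to hold a pointer
   (e.g. LP64), which is exactly where the old habit starts to bite. */
int unsigned_int_truncates_pointers(void) {
    return sizeof(unsigned int) < sizeof(void *);
}

/* Element distance within one array: ptrdiff_t, not int. */
ptrdiff_t index_of(const int *base, const int *elem) {
    return elem - base;
}

/* Demonstration helper. */
ptrdiff_t demo_index(void) {
    static const int table[8] = {0};
    return index_of(table, &table[5]);
}
```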

  5. 64bit ain't all it's cracked up to be.. by way2trivial · · Score: 2, Funny

    I have windows XP 64bit edition, and let me tell you- that 128GB ram limit really pulls me down..

    --
    every day http://en.wikipedia.org/wiki/Special:Random
    1. Re:64bit ain't all it's cracked up to be.. by Hal_Porter · · Score: 2, Informative

      It's because of the pagetables. AMD added another level to take the page tables to a 52bit physical address space. The page table entries are compatible with PAE, which most OS's already support. x86 is 3 level, x86-64 is 4 level.
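
To illustrate the four levels: assuming the usual x86-64 layout with 4 KiB pages and 48-bit virtual addresses, each level consumes 9 bits of the address and the low 12 bits are the offset within the page. The function names below are only for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Index into the page table at a given level for virtual address va.
   level 0 is the last-level table, level 3 the top-level (PML4) table.
   Assumes 4 KiB pages and 9 index bits (512 entries) per level. */
unsigned pt_index(uint64_t va, int level) {
    return (unsigned)((va >> (12 + 9 * level)) & 0x1FF);
}

/* Byte offset inside the 4 KiB page. */
unsigned page_offset(uint64_t va) {
    return (unsigned)(va & 0xFFF);
}
```

Four levels of 9 bits plus the 12-bit offset account for 48 bits of virtual address.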

      There's space in the page table entries to handle 64 bits, but adding extra levels to the translation probably has a performance impact. There's still debate about the best way to do 64 bit address translation, and 52 bits is plenty for now. And when they change, it will only affect the bits of the OS that handle paging; user applications and even device drivers will always see a flat 64 bit virtual address space.

      Like all of x86-64, it's designed for a subtle mix of good performance in the short term and a painless upgrade path for current OS kernels, without really compromising anything in the long term.

      --
      echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
  6. Re:Just a recompile? by ObsessiveMathsFreak · · Score: 2, Insightful

    Provided your code isn't written in assembly, do you really _have_ to do anything else than to recompile it?

    Do you realise how difficult it is to find a healthy goat and sacrificial knife these days?

    --
    May the Maths Be with you!
  7. Most of the time it's easy. by PhrostyMcByte · · Score: 4, Informative

    If you don't make assumptions about pointer sizes in your code, always use size_t in the appropriate places, etc, it is generally just a quick recompile for x64. I find a lot of open source code (I'm sure this isn't exclusive to open source, but, well, I can't see closed source!) spits out hundreds to thousands of warnings about assigning the return of strlen() to an int and other similar and usually harmless things, but most of the time it Just Works (tm).
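
The strlen() warning mentioned above usually goes away by keeping lengths in `size_t`. Where an `int` is unavoidable (an old API, say), a checked conversion is safer than a silent one. A small sketch with hypothetical helper names:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* True if a length can be stored in an int without truncation. */
int length_fits_int(size_t n) {
    return n <= (size_t)INT_MAX;
}

/* strlen() for callers stuck with int: returns -1 instead of truncating. */
int checked_strlen(const char *s) {
    size_t n = strlen(s);
    return length_fits_int(n) ? (int)n : -1;
}
```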

    The only area I've run into things being significantly harder is writing clean lock-free algorithms due to the lack of a CMPXCHG16B instruction in the original spec - only EM64T and very recent AMD64 models have it. There are a couple of ways to hack around this limitation but they aren't very pretty.

  8. Been there, done that by ChaoticCoyote · · Score: 2, Informative

    I've been running a 100% 64-bit dual Opteron rig for almost two years, under Gentoo. No emulation libraries, no multilib, just 64-bit code. Other than Open Office, I've had almost no trouble at all.

    BTW, "64-bits" don't make programs run faster (in general); code compiled for AMD64/EM64T runs faster than its 32-bit counterpart (for the most part) because of the extra general-purpose registers in the AMD 64-bit design.

  9. Re:Sheesh... by eloki · · Score: 3, Insightful

    Ignoring 64Bit helps a lot to write portable code. For 99.999% of all Apps out there 64Bit is irrelevant, anyways.

    I suspect what you're saying is that there is no particular need for 64-bit in most apps, which I agree with. But the point here is that the program should work correctly, which means code that makes assumptions like pointers and ints being the same size needs to be fixed. The point is that amd64 is making 64-bit platforms relevant to more users, not that everyone thinks most apps will be gee-whiz faster as a result.

    As a side note, some programs may realise minor performance gains on amd64 from having more general purpose registers available. This is, of course, technically nothing to do with it being 64-bit but does mean that there is a potential benefit even if you never need more than 4GB of addressable memory.

  10. Use stdint.h! by Chemisor · · Score: 5, Informative

    The article doesn't appear to mention this, but there is a C99 standard header stdint.h, which defines fixed width types. I haven't seen any OSS project use it, for some reason, but it has all the types you need for portable development; int32_t, uint64_t, constant wrappers like UINT64_C, and, of course, limit constants for all of the fixed-size types. Using these is much better than all those size-based #ifdef'ed typedefs I see people use all over their code.
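
A short sketch of what that looks like in practice, using only names from `<stdint.h>` and its companion `<inttypes.h>` (which supplies the matching printf macros). The function names and the 5000000000 figure are just for the example.

```c
#include <assert.h>
#include <inttypes.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* A constant that does not fit in 32 bits, spelled portably. */
uint64_t big_limit(void) {
    return UINT64_C(5000000000);
}

/* Print a uint64_t without guessing whether it is "long" or "long long". */
int format_u64(char *buf, size_t n, uint64_t v) {
    return snprintf(buf, n, "%" PRIu64, v);
}

/* Demonstration helper: number of characters produced for big_limit(). */
int demo_format_len(void) {
    char buf[32];
    return format_u64(buf, sizeof buf, big_limit());
}
```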

  11. More subtleties can arise ... by AK76 · · Score: 5, Informative

    I did a lot of 64-bit cleaning up for the PHP project, and I can tell you that there are more subtle issues that may arise when porting from 32-bit to 64-bit.

    One example:
    on a 32-bit Intel machine, a double is precise enough to distinguish LONG_MAX (the highest representable long) from LONG_MAX+1 (a number that doesn't fit in a long anymore). So for instance, to determine whether a long multiplication has overflowed, you could repeat the same multiplication using doubles and compare the result to (double)LONG_MAX.
    In contrast, on a 64-bit platform LONG_MAX and LONG_MAX+1 are mapped to the same double representation, so there's no way to do the comparison anymore.
    As this example involves static casts, it is something the compiler will usually not warn you about.
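
The effect is easy to reproduce. The sketch below uses `int64_t` and `int32_t` rather than `long` so that it behaves the same on any platform, and assumes the usual IEEE 754 double with a 53-bit significand. The last function is a plain integer-only overflow test based on division; it is a different technique from the double trick and is not claimed to be as cheap.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* With 64-bit integers, INT64_MAX rounds up to 2^63 when converted,
   so MAX and MAX+1 become the same double. */
bool double_hides_int64_max(void) {
    volatile double d = (double)INT64_MAX;  /* volatile: force a real double */
    return d == 9223372036854775808.0;      /* 2^63 */
}

/* With 32-bit integers, MAX and MAX+1 are still distinct doubles. */
bool double_resolves_int32_max(void) {
    volatile double d = (double)INT32_MAX;
    return d + 1.0 != d;
}

/* Would a * b overflow a long? Integer arithmetic only. */
bool long_mul_overflows(long a, long b) {
    if (a == 0 || b == 0) return false;
    if (a > 0) {
        if (b > 0) return a > LONG_MAX / b;
        return b < LONG_MIN / a;        /* a > 0, b < 0 */
    }
    if (b > 0) return a < LONG_MIN / b; /* a < 0, b > 0 */
    return b < LONG_MAX / a;            /* a < 0, b < 0 */
}
```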

    Another thing to be careful about is passing pointers to variadic functions (eg. sscanf), because usually the compiler doesn't know the expected types, as they are buried in the format string, not in the function prototype.
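
For the variadic case, the safe pattern is to make the conversion specifier and the pointed-to type agree by construction. A minimal sketch (helper names are hypothetical) using the `SCN` macros from `<inttypes.h>`:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* "%ld" must be paired with a long*, never an int*: on LP64 sscanf would
   write 8 bytes through a pointer to a 4-byte object. */
long parse_long(const char *text) {
    long v = 0;
    if (sscanf(text, "%ld", &v) != 1)
        return -1;   /* sketch only: -1 doubles as the error value */
    return v;
}

/* For fixed-width types, let the macro pick the right length modifier. */
int64_t parse_i64(const char *text) {
    int64_t v = 0;
    if (sscanf(text, "%" SCNd64, &v) != 1)
        return -1;
    return v;
}
```

Compilers that check format strings (for example gcc's -Wformat) will catch many mismatches, but only when the format is a literal they can see.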

    1. Re:More subtleties can arise ... by AK76 · · Score: 2, Insightful

      First of all, this dumbass incompetent nutjob didn't write this code in the first place. It's a real world example of code I happen to have fixed because it turned out not to work on 64-bit.

      However, while it is indeed a hack, I would like to challenge you to suggest a better version that:
      a) is as portable (so no assembly for checking overflow flags that for instance Alpha doesn't have)
      b) uses only 1 multiplication and a comparison in case of overflow, and 2 multiplications otherwise.

      The point is, from a pragmatic point of view, this is a very valid solution that has always worked and produced 100% correct results until 64-bit CPUs came about.

  12. Re:/lib was botched, so yes you must port librarie by dastrike · · Score: 2, Informative
    We'll have a /lib directory without libraries, and the "/lib64" wart lasting until the end of time

    Nah. That's a bit pessimistic an outlook. Already today /lib64 is a mere symlink to /lib on current distributions. The symlink may have to be kept around for a while though, until the early nomenclature oopses have been effectively phased out.

    $ uname -m
    x86_64
    $ ls -ld /lib*
    drwxr-xr-x 17 root root 4544 2006-03-26 14:52 /lib
    drwxr-xr-x 2 root root 2120 2006-04-02 13:59 /lib32
    lrwxrwxrwx 1 root root 3 2006-02-28 03:09 /lib64 -> lib
    --
    while true; do eject; eject -t; done
  13. Even well written code can have problems by tlambert · · Score: 2, Insightful

    Even well written code can have problems.

    Specifically, say I have a 64 bit platform capable of running both LP64 code and ILP32 (legacy) code.

    I use a shared memory segment to communicate between my legacy 32 bit applications, and it has internal use of pointers to perform self-reference on data.

    [Rather than complicating things, let's just assume that the pointers are internally based off the base address of the shared memory segment, rather than being based off of 0, so there is no requirement of mapping the memory into the same location in each process]

    I'm now adding a 64 bit computation engine (perhaps my application is a rendering system that uses plug-ins, and being able to work on large data sets with the large address space afforded to 64 bit processes is critical, but when it comes to displaying the results, I can live easily in a 32 bit address space, so I'm not trying to port my whole tool over to 64 bits).

    So now I have to deal with the internal pointers in the shared memory segment. I can do one of several things:

    (1) I can use structure coercion to treat the pointers as if they were integer offsets instead, and coerce them into pointers internally in the 64 bit code (on LP64, pointers are 64 bit).

    (2) I can internally store 64 bit pointers, rather than 32 bit pointers. This means I need the same round-tripping, but it can take place in the 32 bit applications, rather than the 64 bit applications, and the integer representation is "long long" as far as it's concerned.

    (3) I can support either a "short void *" in 64 bit applications, or a "long void *" in 32 bit applications.

    If I go with approach #3, I get to keep my type checking. With the other two approaches, I have explicit coercion, and I lose my type checking and boundary/range checking: the explicit casts quiet the warnings, even when they are used incorrectly.
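
A rough sketch of approach (1), under the stated assumption that references are offsets from the segment base: the shared structure stores a fixed-width 32-bit offset, and small accessor functions do the conversion so that the casts live in one place instead of being scattered around. All names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHM_NULL_OFF UINT32_MAX   /* hypothetical "no node" marker */

/* Same layout in 32-bit and 64-bit processes: no pointer members. */
struct shm_node {
    uint32_t next_off;   /* byte offset of the next node from the base */
    int32_t  value;
};

/* Offset -> pointer, valid in whichever process has the segment mapped. */
struct shm_node *shm_node_at(void *base, uint32_t off) {
    if (off == SHM_NULL_OFF)
        return NULL;
    return (struct shm_node *)((unsigned char *)base + off);
}

/* Pointer -> offset, with the range check the bare cast would skip. */
uint32_t shm_offset_of(const void *base, const struct shm_node *n) {
    ptrdiff_t d = (const unsigned char *)n - (const unsigned char *)base;
    assert(d >= 0 && (uint64_t)d < SHM_NULL_OFF);
    return (uint32_t)d;
}

int32_t shm_sum_list(void *base, uint32_t head_off) {
    int32_t total = 0;
    for (struct shm_node *n = shm_node_at(base, head_off); n != NULL;
         n = shm_node_at(base, n->next_off))
        total += n->value;
    return total;
}

/* Demonstration: an ordinary array stands in for the shared segment. */
int32_t shm_demo(void) {
    struct shm_node seg[3];
    seg[0].value = 1;  seg[0].next_off = shm_offset_of(seg, &seg[2]);
    seg[2].value = 5;  seg[2].next_off = shm_offset_of(seg, &seg[1]);
    seg[1].value = 10; seg[1].next_off = SHM_NULL_OFF;
    return shm_sum_list(seg, 0);
}
```

This does not restore the compiler's type checking that approach (3) would keep; it only narrows the unchecked casts down to two functions.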

    If I go further, and allow the segment to be mapped anywhere in memory, it may be mapped over 4G. I might also have relative base addressing (e.g. listener converts, where I store the internal base address in the provider as part of the data being provided). This may sound like a strange scenario (e.g. it's like DCE RPC, in that it becomes the receiver's responsibility to convert, if a conversion is needed), but it's very useful. It has the following attributes:

    (a) If I use homogeneous consumer/providers, no conversion is necessary

    (b) My "work horse" applications can do their work, and it's up to my "viewer" application to do the conversion; presumably, it's not doing much other than interacting with a slow human, so this ends up being the best division of labor

    (c) As time goes forward, the rest of my application is likely to migrate to 64 bit as well, so I get performance improvements over time, as the conversion requirements drop out.

    You could argue that because the program was not 64 bit clean, it's not "well written". You could also argue that losing the compiler warning checking is "OK, because it's your own fault for not porting everything" (if you didn't believe in closed source third party plugins over which you had no control).

    I would argue that you can't expect someone to accurately predict future users of their software, and there's only so much work you can do to make sure that things don't break horribly at some arbitrary point in some arbitrary compilation environment.

    For the most part, we have to rely on our tools.

    And our tools do not tell us when this type of problem happens, because this type of problem is relatively new.

    -- Terry

  14. 9-bit by r00t · · Score: 2, Informative

    char was 9-bit

    C requires at least 8 bits for char, so 6 isn't good enough.
    All types must be a multiple of the size of char, because
    sizeof(char) is 1 by definition and fractions are not OK.

    Valid sizes are thus: 9, 12, 18, 36

    The char-short-int-long progression may be one of:

    9,18,18,36 a likely choice
    9,18,27,36 this is the cool way: sizeof(int)==3
    9,18,36,36 a likely choice
    9,27,27,36
    9,27,36,36
    9,36,36,36 a likely choice
    12,24,24,36
    12,24,36,36
    12,36,36,36
    18,18,18,36
    18,18,36,36
    18,36,36,36
    36,36,36,36
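
The rules used above can be written down directly in C: `sizeof` counts in units of char, and `CHAR_BIT` from `<limits.h>` says how many bits that unit has. A small sketch; `BITS_OF` and the function names are made up here, and the macro counts storage bits, padding included.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Storage width of a type in bits: always a whole multiple of CHAR_BIT. */
#define BITS_OF(T) (sizeof(T) * CHAR_BIT)

static_assert(CHAR_BIT >= 8, "C requires at least 8 bits in a char");
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");

size_t bits_in_short(void) { return BITS_OF(short); }
size_t bits_in_int(void)   { return BITS_OF(int); }
size_t bits_in_long(void)  { return BITS_OF(long); }
```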

  15. Re:Specifying bit lengths by larry+bagina · · Score: 2, Informative

    it could be the fact that the preprocessor doesn't understand sizeof()
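
Since `#if sizeof(long) == 8` is not something the preprocessor can evaluate, the usual workaround is to test the limit macros, which are plain integer constants. A sketch, with macro names invented for the example and assuming types without padding bits:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Preprocessor-time checks: compare limit macros, not sizeof(). */
#if ULONG_MAX > 0xFFFFFFFFUL
#define LONG_IS_WIDER_THAN_32_BITS 1
#else
#define LONG_IS_WIDER_THAN_32_BITS 0
#endif

#if UINTPTR_MAX > 0xFFFFFFFFUL
#define POINTER_IS_WIDER_THAN_32_BITS 1
#else
#define POINTER_IS_WIDER_THAN_32_BITS 0
#endif

/* Compile-time (not preprocessor-time) checks can use sizeof directly. */
static_assert(sizeof(int32_t) * CHAR_BIT == 32, "int32_t must be 32 bits");

/* The same question asked with sizeof, for comparison at run time. */
int long_is_wide_by_sizeof(void) {
    return sizeof(long) * CHAR_BIT > 32;
}
```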

    --
    Do you even lift?

    These aren't the 'roids you're looking for.