Slashdot Mirror


Effect of Using 64-bit Pointers?

An anonymous reader queries: "Most 64-bit processors provide a 32-bit mode for compatibility, but 64-bit pointers are becoming essential as systems move beyond 4GB of RAM. Also, the large virtual address space is very useful for several reasons - allowing large files to be memory-mapped, and allowing pages of memory to be remapped without ever requiring the virtual address space to be defragmented. However, 64-bit pointers take up twice as much memory, which immediately affects memory footprint. This is especially an issue on embedded platforms where RAM is at a premium, but even on systems where RAM is plentiful and cheap the extra memory footprint reduces cache performance. Have Slashdot readers done any research into the actual effect of using 64-bit pointers in a 'typical' application? What proportion of a real program's data is actually pointers?"

164 comments

  1. easy... by edrugtrader · · Score: 5, Funny

    Have Slashdot readers done any research into the actual effect of using 64-bit pointers in a 'typical' application?

    none whatsoever.

    What proportion of a real program's data is actually pointers?

    none whatsoever.

    oh... i use java.

    --
    MARIJUANA, SHROOMS, X: ONLINE?! - E
    1. Re:easy... by El · · Score: 5, Funny

      Does Java handle datasets larger than 4GBytes, or does it run so slowly that nobody has been able to find out whether or not it handles them? In the underlying implementation, isn't EVERY object actually a pointer?

      --

      "Freedom means freedom for everybody" -- Dick Cheney

    2. Re:easy... by gl4ss · · Score: 5, Insightful

      every object (which is everything apart from your basic int & etc) is a reference, which pretty much is a pointer with a fancy name. as to handling 4gbytes I really don't see why it couldn't, it's just a matter of the vm supporting it anyways (afaik neither the design nor the bytecode limits it).

      however, can you think of any system where you had objects, sets of data, and they weren't (at least underneath) pointers to memory?

      and as to the original subject one poster already said it best: if you really have the need for that extra effort of going 64bit pointers you will probably have the memory to spare, no? anyways it will only be a problem if the pointers are big enough in comparison to what they're pointing to.. in which case you should probably rethink what you're doing anyways if you care a squat about memory footprint.. bringing the embedded devices to the discussion at this point is totally pointless but of course cool sounding and slashdot editor catchy.

      bleh I'm no expert anyways.

      --
      world was created 5 seconds before this post as it is.
    3. Re:easy... by E_elven · · Score: 1

      I believe the parent referred to heap memory rather than pointers -and in this they are most likely correct.

      The second part I agree with..

      Etta nain.

      --
      Marxist evolution is just N generations away!
    4. Re:easy... by Waffle+Iron · · Score: 5, Insightful
      oh... i use java.

      If you'd pay proper attention to Sun's marketing machine, you'd remember that Java uses a just-in-time compiler. What does a compiler do? It turns all of your "object-oriented is the only valid programming paradigm" source code into a big bucket of CPU-specific opcodes, numbers and *pointers*.

      In fact, it will probably have more pointers than the corresponding C or C++ program would have, due to the plethora of tiny objects you're encouraged to spawn. Naturally, the pointer size would match the CPU architecture on which the program is being run and would consume a corresponding number of cache bytes.

    5. Re:easy... by buttahead · · Score: 0, Redundant

      +1 insightful.

    6. Re:easy... by jstarr · · Score: 5, Informative

      Java does not care about memory limits, the JVM does. The stock Sun JVM for x86 machines will address a maximum of 3-4 GiB (dependent on operating system). However, the IBM JVM on an AIX machine has no practical limit and can easily access >16 GiB memory, if available. If a JVM is so designed, there is no reason a Java program cannot access as much memory as a program written in C.

      I run very large simulations on various platforms, and some of my simulations have to be run on a 64-bit machine because of the memory requirements. Sun's Java forums have several posts asking about the maximum heap size (maximum accessible memory) on various platforms, and you can find more exact numbers for specific platforms and operating systems there.

      An object is an object, not a pointer. However, objects are accessed through a reference, which in implementation, is typically a pointer.

    7. Re:easy... by addaon · · Score: 1

      However, arrays are limited to 2^32 elements in size.

      --

      I've had this sig for three days.
    8. Re:easy... by Yuioup · · Score: 2, Informative

      That's not true!!

      When you create an object in Java, you are, in a sense, creating a pointer. As a matter of fact it's easy to make a linked list or a binary tree with Java, the same way you do in C. Just because it's not explicitly called a pointer doesn't mean it isn't used.

      Ever heard of a NullPointerException?

      "Java doesn't have pointers" is a hype phrase still left over from the Dot Bomb era...

    9. Re:easy... by Anonymous Coward · · Score: 0

      Wow, you don't have a clue about how Java works.

    10. Re:easy... by bunratty · · Score: 1

      Nope, 2^31 elements. ints are always signed, and you cannot use negative indexes in Java arrays.

      --
      What a fool believes, he sees, no wise man has the power to reason away.
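The limit discussed above comes down to simple arithmetic. Here is a rough C++ sketch of it, assuming only that a Java `int` is a signed 32-bit value and that an array's length is an `int`; the names `kMaxJavaArrayElements` and `max_array_payload_bytes` are made up for illustration and are not part of any Java API.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// A Java int is a signed 32-bit value, modelled here with std::int32_t.
// An array's length is an int, so it can never exceed 2^31 - 1 elements.
constexpr std::int64_t kMaxJavaArrayElements =
    std::numeric_limits<std::int32_t>::max();

// Hypothetical helper: upper bound on the payload of a single array
// whose elements each occupy element_bytes bytes.
constexpr std::int64_t max_array_payload_bytes(std::int64_t element_bytes) {
    return kMaxJavaArrayElements * element_bytes;
}

// Usage sketch: the element count is capped, the byte size is not.
inline void self_check() {
    assert(kMaxJavaArrayElements == 2147483647LL);
    // An array of 8-byte longs could, in principle, hold just under 16 GiB.
    assert(max_array_payload_bytes(8) == 17179869176LL);
}
```

So the cap is on the number of elements, not on bytes; whether a given VM can actually allocate an array that large is a separate question.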
    11. Re:easy... by bunratty · · Score: 1
      every object is a reference
      No, objects and references are completely different things.

      Objects can vary in size, always live on the heap, and are always instances of a concrete (non-abstract) class.

      References always have the same size, can live on the heap or on the stack, and can have any type (class or interface).

      References are the things that point to objects. Every time you deal with an object you do it by way of a reference to the object. But that doesn't mean that objects and references are the same thing!

      --
      What a fool believes, he sees, no wise man has the power to reason away.
    12. Re:easy... by addaon · · Score: 1

      Yup, good catch. I was wrong.

      --

      I've had this sig for three days.
    13. Re:easy... by GuyZero · · Score: 1

      Yes, Java uses pointers under the hood. Object references are by definition in the language spec a 32-bit data type.

      There is a 64-bit VM for Solaris from Sun and there may also be one from IBM for AIX.

      I saw a presentation at JavaOne from someone using the 64-bit VM to bring massive datasets into memory and not do any I/O during a large dataset transformation operation.

      The 64 bit VM team from Sun said, in a different session, that the base performance is about 75% that of the normal VM. So you pay some penalty for using 64 bits, but you presumably rearchitect your app and make it up somehow.

    14. Re:easy... by Ninja+Programmer · · Score: 1
      as to handling 4gbytes I really don't see why it couldn't, it's just a matter of the vm supporting it anyways(afaik the design, nor the bytecode, limits it).
      Strings are necessarily at most 4GB in length. This is part of the definition of the language. Therefore there are at least *some* objects which are limited to a 4GB size.

      Also integers are 32bits exactly in Java, so all arrays are necessarily limited to having (about) 4 billion entries. Though, of course, each entry may be more than one byte in size.
    15. Re:easy... by Anonymous Coward · · Score: 1, Funny

      Of course it can handle datasets larger than 4 gigs. How else could you write a HelloWorld program with it ?

    16. Re:easy... by be-fan · · Score: 1

      I hate to break it to you, but in Java, everything except primitives is accessed via a pointer. Sometimes, even via a double pointer (pointer-to-pointer), depending on the VM.

      --
      A deep unwavering belief is a sure sign you're missing something...
    17. Re:easy... by Anonymous Coward · · Score: 0

      Thank you Mr. Pedant!

    18. Re:easy... by Zan+Zu+from+Eridu · · Score: 1
      Also integers are 32bits exactly in Java, so all arrays are necessarily limited to having (about) 4 billion entries. Though, of course, each entry may be more than one byte in size.

      Are you sure the total (byte-)size of an array can exceed 4Gb? As I recall, both the reference and returnAddress types are Category 1 (32-bit) in the VM specification, which implies 4Gb is the maximum size of both data and bytecodes.

    19. Re:easy... by Ninja+Programmer · · Score: 1
      Are you sure the total (byte-)size of an array can exceed 4Gb?
      No, I have just been learning the language, not the VM. You might be right. But, I think it would be a crying shame if it were the case -- why not leave addresses/handles as abstracted values that the VM doesn't know the size of? So at least that way there is a way to sort of move to 64 bits.
    20. Re:easy... by Zan+Zu+from+Eridu · · Score: 1
      why not leave addresses/handles as abstracted values that the VM doesn't know the size of?

      Because it would slow down java considerably? Java is very much a reference based language, and abstracting references in the VM would mean considerably more machine instructions to be run for every reference in the bytecode. Abstractions you put in the VM have effects on most or all bytecode you run on it, so you basically don't want to do expensive abstractions in the VM.

      So at least that way there is a way to sort of move to 64 bits.

      The obvious solution would be to specify a 64-bit VM in a new version of java, but I guess this would slow down this new version of java on all 32-bit platforms, which would have to emulate 64-bit instructions for all basic operations like referencing, integer math, etc. (This slowdown would be less than abstracting references though, because you don't have to take other bitsizes into account and can do some optimizations. Basically, having abstract references comes down to handling bignum pointers.)

      I could see a problem coming up with people producing client-side java apps trying to slow down the migration because their customers run 32 bits, while the server-side guys want 64 bits yesterday. This is of course assuming we want to keep java portable in the java-sense.

  2. Embedded 64-Bit by MBCook · · Score: 3, Insightful
    If you were going to build something that used embedded 64 bit processing, why would you choose a processor with a 64 bit address space? If you need that much address space, then chances are you can handle the extra RAM needed by the pointers, right?

    Is this really a problem in the embedded space?

    --
    Comment forecast: Bits of genius surrounded by a sea of mediocrity.
    1. Re:Embedded 64-Bit by T-Ranger · · Score: 1

      More to the point, if you NEED lots of addressing space, then you will HAVE lots of space to store it, no?

    2. Re:Embedded 64-Bit by Anonymous Coward · · Score: 0

      I'm guessing that anyone needing enough performance to warrant a 64-bit processor would probably just switch to a very fast DSP optimized for the particular purpose. I can't think of any reason to pack that much raw, unfocused power into an appliance. Versatility is not usually a trait of embedded devices.

    3. Re:Embedded 64-Bit by PenguinOpus · · Score: 5, Insightful

      You missed a good point in the original question. Even if you have tons of RAM, cache size is not growing as quickly and you will thrash your data cache far more quickly if all your pointers double in size. I don't know if immediate mode addressing instructions are common for 64bit operands but if they are, it could thrash your icache sooner as well.

      Bandwidth from memory to cache will also be used by these larger pointers.

      OTOH, other than disk controller caches (?), what kind of embedded systems need more than 4GB online simultaneously ?
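To put a number on the cache argument above, here is a back-of-the-envelope C++ sketch. The 64-byte cache line is an assumption (line sizes vary by processor), and `pointers_per_line` and `lines_needed` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: a 64-byte data cache line. Real line sizes differ by CPU.
constexpr std::size_t kCacheLineBytes = 64;

// How many pointers of the given width fit in one cache line.
constexpr std::size_t pointers_per_line(std::size_t pointer_bytes) {
    return kCacheLineBytes / pointer_bytes;
}

// Cache lines touched by a dense array of n pointers (rounded up).
constexpr std::size_t lines_needed(std::size_t n_pointers,
                                   std::size_t pointer_bytes) {
    return (n_pointers * pointer_bytes + kCacheLineBytes - 1) / kCacheLineBytes;
}

// Usage sketch: doubling the pointer width halves the pointers per line.
inline void self_check() {
    assert(pointers_per_line(4) == 16);
    assert(pointers_per_line(8) == 8);
    assert(lines_needed(1000, 4) == 63);
    assert(lines_needed(1000, 8) == 125);
}
```

This only covers densely packed pointers. How much it matters for a real program depends on what fraction of the hot data is pointers, which is exactly the question the submitter asked.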

    4. Re:Embedded 64-Bit by Pseudonym · · Score: 5, Informative
      OTOH, other than disk controller caches (?), what kind of embedded systems need more than 4GB online simultaneously?

      There's a lot of modern medical equipment which can definitely use the 4GB. MRI machines, CT scanners, ultrasound machines ("sonographs" if you prefer the term) and so on do tend to chew up memory. Particularly the first two, because you often need to hold whole voxel sets in memory while you compute a bunch of cross-sections at odd angles.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    5. Re:Embedded 64-Bit by Grab · · Score: 4, Insightful

      I'd say this isn't a problem, will never be a problem, and the person who posted that initial question really doesn't know shit about embedded.

      Embedded devices come in all sorts of varieties from 4-bit to 64-bit, and will do for the foreseeable future. When you're producing X million chips, the software is amortised to basically nothing and the hardware cost becomes the primary concern, so there is no chance that lower-spec chips will ever go away in the future.

      So you're not going to be forced to use a 64-bit chip in your design, just because the chip company has stopped selling the lower spec ones. In the PC business this does happen, because there's no demand for older, lower-spec chips. In the embedded market though, the demand is there and will continue to be there, so the situation has not and will not arise.

      If your target application needs 64-bit processing, you choose a device that does 64-bit processing, and you choose RAM size to suit. If you don't need it, you don't choose it. Simple.

      Someone elsewhere had some questions about internal registers/internal RAM. Well as with all processors, some give you enough registers and some don't. Again, the engineer just has to pick the processor that gives the capabilities they want.

      Grab.

    6. Re:Embedded 64-Bit by Scherf · · Score: 2, Interesting

      OTOH, other than disk controller caches (?), what kind of embedded systems need more than 4GB online simultaneously ?

      Some CAD programs used in Mechanical Engineering (CATIA V5 for example) could use that much. Loading a whole car engine into one of these programs will exceed 4 GB pretty quickly.

    7. Re:Embedded 64-Bit by j3110 · · Score: 3, Interesting

      I think the cache argument is complete BS. It appears it would be true, but not really. Most pointers are created and controlled by the compiler, and they are going to be relative 90% of the time. That's why relative addressing was invented. So, you get an extra 4 bytes stuffed into your cache on the relatively rare occurrence that one of your 8 byte pointers is being used. In this 8 byte pointer, I'm just going to assume that you aren't idiot enough to be accessing memory in the same page most of the time. I really think that the page switching of the RAM to access the data at the other end of the pointer is going to be the greatest overhead. Normal cache misses in a 64bit addressing scheme should be exponentially more than a 32bit, if you really needed the 64bit.

      So while you may have a caching problem, I think it's going to be because of accessing more data rather than the 4 bytes extra on some pointers.

      Now if you're using disk based data structures, you better be using 64bit. I could make an exception if you used a 32bit number to address the cluster, then a 16bit number to access the actual data in the cluster, if required. A good DB server would do well to use 32bit cluster numbers to save index size, then scan the loaded cluster for the record. AFAIK, no one has been clever enough to do this, but I'm not privy to the internal structures of a lot of DBMSs. And this would matter a lot, because you could fit much more of the index into memory, and have much less data to read on the drive. Throwing away CPU cycles and memory for more compact disk data is a common practice.

      --
      Karma Clown
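The "32bit cluster number plus 16bit offset" idea above can be sketched roughly as follows. This is one illustrative reading of the comment, not any real DBMS's on-disk format; `DiskAddress`, `to_byte_offset` and `from_byte_offset` are hypothetical names, and the cluster size is assumed to be at most 64 KiB so the offset fits in 16 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical compact disk address: 6 bytes of payload when packed on disk
// (the in-memory struct may be padded to 8 by the compiler).
struct DiskAddress {
    std::uint32_t cluster;  // which cluster holds the record
    std::uint16_t offset;   // byte offset of the record inside that cluster
};

// Expand the compact form to a full 64-bit byte offset.
// Assumption: cluster_bytes <= 65536, so every offset fits in 16 bits.
constexpr std::uint64_t to_byte_offset(DiskAddress a, std::uint64_t cluster_bytes) {
    return static_cast<std::uint64_t>(a.cluster) * cluster_bytes + a.offset;
}

// Split a byte offset back into cluster number and in-cluster offset.
constexpr DiskAddress from_byte_offset(std::uint64_t byte_offset,
                                       std::uint64_t cluster_bytes) {
    return DiskAddress{static_cast<std::uint32_t>(byte_offset / cluster_bytes),
                       static_cast<std::uint16_t>(byte_offset % cluster_bytes)};
}

// Usage sketch: an offset above 4 GiB survives the round trip.
inline void self_check() {
    assert(to_byte_offset(DiskAddress{3, 100}, 4096) == 12388ULL);
    const std::uint64_t big = 5000000000ULL;  // beyond a 32-bit byte offset
    assert(to_byte_offset(from_byte_offset(big, 65536), 65536) == big);
}
```

An index that stores only the 32-bit cluster number, and scans the loaded cluster for the record, would be smaller still, at the cost of extra CPU work per lookup.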
    8. Re:Embedded 64-Bit by angel'o'sphere · · Score: 1

      Most pointers are created and controlled by the compiler, and they are going to be relative 90% of the time. That's why relative addressing was invented. So, you get an extra 4 bytes stuffed into your cache on the relatively rare occurrence that one of your 8 byte pointers is being used.


      This makes no sense to me.
      class Address;    // forward declarations, so Person can hold pointers
      class PhoneInfo;

      class Person {
        string name, surname;
        Address* address;
        PhoneInfo* phoneinfo;
        // ...
      };

      class Address {
        string street;
        string number;
        string zipcode;
        // ...
      };

      class PhoneInfo {
        string countrycode;
        string areacode;
        string number;
        // ...
      };
      The STL string is not itself a pointer but manages an array (a pointer) to its data, char or wchar.
      Besides that, nearly all data structures are pointers. Of course you could optimize a bit ... e.g. using aggregated objects instead of pointers. But the point is to show that a standard C/C++ program will have a LOT of 64 bit pointers if you switch to it.

      OTOH if you have separate code and data caches or if you work with arrays, having the pointer in a register, the cache influence should not be that high (in the first level cache). The second level cache will of course be another issue, as a lot of the objects, like the simple ones I have sketched, are now twice as big.

      The only pointers which are relative all the time, are the pointers used to fetch a field from a struct/class. But the field fetched is itself a pointer, which is now twice as big. The other relative "pointers" are jump destinations ... and I don't think one did consider them as pointers in this discussion.

      Regards,
      angel'o'sphere
      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
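How much a pointer-heavy object grows can be estimated without a 64-bit compiler by standing in an integer of the right width for each pointer. The sketch below is a simplification under stated assumptions: each string member is modelled as a single data pointer (real string implementations also carry a size and capacity), and the `age` field and the name `PersonLayout` are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Ptr stands in for a pointer of a given width:
// std::uint32_t for a 32-bit build, std::uint64_t for a 64-bit build.
template <typename Ptr>
struct PersonLayout {
    Ptr name_data;     // data pointer inside a string-like member (simplified)
    Ptr surname_data;  // likewise
    Ptr address;       // Address*
    Ptr phoneinfo;     // PhoneInfo*
    std::int32_t age;  // hypothetical non-pointer field
};

constexpr std::size_t kSize32 = sizeof(PersonLayout<std::uint32_t>);
constexpr std::size_t kSize64 = sizeof(PersonLayout<std::uint64_t>);

// Usage sketch: four 4-byte pointers plus one int32 is 20 bytes; with
// 8-byte pointers it is at least 36, and typically 40 once the compiler
// pads the struct to 8-byte alignment.
inline void self_check() {
    assert(kSize32 == 20);
    assert(kSize64 >= 36);
    assert(kSize64 > kSize32);
}
```

For an object made almost entirely of pointers, the footprint roughly doubles, and alignment padding can add a little on top.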
    9. Re:Embedded 64-Bit by j3110 · · Score: 1

      I guarantee you that you'll have more data in the class than the pointer to it always... or else you're just being really stupid. I'm also assuming that most people are going to have long addresses and many phone numbers. If you add in the other fields to those objects, I think you're looking at your data being anywhere from 10-30% pointers even in a 64bit world.

      Most pointers are code relative jumps IE function addresses. Most people should probably look more for data segmentation than absolute addresses anyhow, especially on embedded devices where memory tends to be much slower. Switch pages only when you need to. Most 64bit addresses are only going to need maybe 4-8 bits above the 32bit mark... even for VM. It would probably be better to store 8bit pointers then combine them with a 32bit offset to generate the 64bit absolute address. I can't think of many applications that would benefit from the terabyte of address space this would give though. Your application will likely not need that for memory addressing.

      Like I said in a previous post, there are ways to work around it for huge databases too. 4K disk blocks * 2^32 = 16TB database. I don't know of any database that has crossed the terabyte threshold yet, but it's very likely there are some... On the kinds of machines that you would use databases that large, it's still going to be better to increase the disk block size than the 32bit pointers. It's not uncommon to have 64K clusters on desktop machines these days... That's 256TB.

      --
      Karma Clown
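The figures in the comment above check out if "TB" is read as a binary terabyte (2^40 bytes). A small C++ sketch of the arithmetic follows; `addressable_bytes` and `combine` are hypothetical helpers, and the 8-bit-plus-32-bit split is just one reading of the scheme described.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kTiB = 1ULL << 40;  // binary terabyte

// Bytes reachable with an index of index_bits bits over fixed-size blocks.
// Assumption: the result fits in 64 bits (true for the cases used here).
constexpr std::uint64_t addressable_bytes(std::uint64_t block_bytes,
                                          unsigned index_bits) {
    return block_bytes << index_bits;
}

// One reading of the "8bit pointer plus 32bit offset" scheme: the 8-bit part
// supplies address bits 32..39, giving a 40-bit (1 TiB) address.
constexpr std::uint64_t combine(std::uint8_t high, std::uint32_t low) {
    return (static_cast<std::uint64_t>(high) << 32) | low;
}

// Usage sketch reproducing the numbers from the comment.
inline void self_check() {
    assert(addressable_bytes(4096, 32) == 16 * kTiB);    // 4K blocks
    assert(addressable_bytes(65536, 32) == 256 * kTiB);  // 64K clusters
    assert(combine(0xFF, 0xFFFFFFFFu) == kTiB - 1);      // top of the 40-bit range
}
```

Note that `combine` only shows the address arithmetic. It says nothing about how a compiler or language would track which 8-bit part belongs to which offset, which is the maintainability objection raised in the reply.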
    10. Re:Embedded 64-Bit by angel'o'sphere · · Score: 1


      I guarantee you that you'll have more data in the class than the pointer to it always... or else you're just being really stupid. I'm also assuming that most people are going to have long addresses and many phone numbers. If you add in the other fields to those objects, I think you're looking at your data being anywhere from 10-30% pointers even in a 64bit world.


      You seem not to understand.
      I'm not talking about pointers TO objects, but about pointers INSIDE of those objects to other objects.
      And yes, those are over 50%. The objects I sketched are just vanilla ... and they ONLY consist of pointers or are wrapped pointers like the STL string class.

      Most pointers are code relative jumps IE function addresses.
      It would probably be better to store 8bit pointers then combine them with a 32bit offset to generate the 64bit absolute address.

      Sure ... and you immediately have a bunch of unmaintainable code :-)
      And how, btw, do you ensure that this is even portable from compiler to compiler or on the same compiler from processor revision to processor revision?

      Probably you should TRY the stuff you are suggesting once in a slightly bigger than "toy project".

      angel'o'sphere

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    11. Re:Embedded 64-Bit by Anonymous Coward · · Score: 0

      couple things: you are correct in that the average joe coder shouldn't care too much about these issues, since he probably never knew anything about his programs' cache behavior in the first place...but
      1. Even when you do not have a lot of pointer heavy data structures, a load/store of a pointer from the stack (a var passed as a parameter or spilled local) is accessed relative to a frame pointer, but that doesn't negate the fact that {int *foo} is register (8 bytes) size. With all other things being equal, a 4 byte pointer [load or store] in 32b generated code will turn into an 8 byte pointer [load or store] in the corresponding 64b case. If you study code outputs on the SGI MIPS, which has extremely similar 32 and 64 bit modes, you can see this is the case- lw turns into ld and so forth. Arch. designers in the last few years have been well aware of the huge amount of function calls and object pointers tossed around in current code; reduction of [apparent] spills can be achieved in hardware through register renaming and also through things like the itanium register stack. Compiler folk have been working a lot of magic as well.

      2. I'm not sure what you are talking about with page switching. It is ok to access memory across tons of pages so long as it is all in core and the TLB (translation lookaside buffer) behavior is such that address computation can be done efficiently. Some profilers yield information on working sets, TLB and page access behavior, if you are interested.

    12. Re:Embedded 64-Bit by j3110 · · Score: 1

      For one, the TLB has to be bigger for 64bit addresses.

      Secondly, even RAM running at 333MHz can only switch pages at about 111MHz optimally. Every time you leave a 4K barrier, you might as well put 20 NOP's in your code.

      You're going to have some cache misses... It's best if you try to make them only have to do a row select on RAM. If you have huge data structures that warrant 64bit addressing, you aren't going to have as much cache to help prevent this. In fact, you should be flipping pages almost all the time.

      When allocating memory for objects, you should allocate it all together, and you need to set a maximum size for your object, then use relative addressing.

      The neat thing about VMs is that they will do this for you. Some defragment your memory periodically. They manage all your pointers, so they know what size they should be. VMs generally allocate contiguous heaps of memory to avoid paging when possible too. There are a lot of advantages to VM software, that until recently haven't been leveraged enough to make it show on benchmarks.

      --
      Karma Clown
  3. Embedded platforms?!? by El · · Score: 3, Interesting

    How many embedded devices are running 64-bit processors now? Offhand, I'd say this is only a problem if you have an embedded device with more than 4 GBytes of memory... in other words, it hardly sounds like a real-world problem for embedded devices. Yes, workstations and servers with 64-bit processors should probably be using 64-bit pointers.

    --

    "Freedom means freedom for everybody" -- Dick Cheney

    1. Re:Embedded platforms?!? by Smidge204 · · Score: 1

      In fact, a very small portion of embedded devices are even 16bit, and I can't think of any that are 32bit... What's the point of a 64bit embedded device? RAM is at a premium but you still need >4GB of it?

      Or maybe what I'm used to calling an "embedded" device isn't the same as the submitter's...
      =Smidge=

    2. Re:Embedded platforms?!? by MerlynEmrys67 · · Score: 5, Insightful
      Worked on a Xeon based embedded platform that could have 16 GB of Ram on the system board... You forgot that Intel provides a segmented architecture didn't you ?

      By the way, the limit was from physical slots - 8 and a 2GByte DIMM memory limit, increase either of those and guess what.

      Now each "process" on our box could only address 4 Gbyte of that memory, but that was a completely different question (and in fact limited by the libraries that were used - again a different story)

      I remember these conversations when the 32 bit world came around - what do you mean I have to put 4 bytes into the processor. End result is that the code is a little larger, and a little slower - and Moore's law marches on and we don't even notice

      --
      I have mod points and I am not afraid to use them
    3. Re:Embedded platforms?!? by Anonymous Coward · · Score: 0

      There was an article three days ago about a MIDI keyboard (y'know, a digital musical instrument).

      It is available with dual Athlon 64 processors and it runs Windows XP Pro.

    4. Re:Embedded platforms?!? by davetm · · Score: 2, Interesting

      Personally, both types of embedded device I've worked on have been 32 bit. The first was a database engine (think network attached storage^H^H^H^H^H^H^Hdatabase of several Terabyte dataset size), and the second is set top boxes for digital tv. In the case of the first I can immediately see the need for 64bit arithmetic AND addressing. In the case of the set top box I think 32 bits will be fine for a while yet; there is pressure for faster processors, but not for 64 bit arithmetic.

      --
      -- Dave
    5. Re:Embedded platforms?!? by addaon · · Score: 3, Interesting

      Many 32 bit platforms, including x86, PowerPC, etc, support 64GB of ram... but only 4GB of address space. Most people want more than 4GB of address space, but don't yet care about more than 4GB of ram.

      --

      I've had this sig for three days.
    6. Re:Embedded platforms?!? by Endive4Ever · · Score: 2, Insightful

      Two of the three projects using embedded devices that I wrote the code for used 4 bit processors. The third used an 8 bit processor. There are still many millions of 4 and 8-bit processors being designed into products. You can't even order a mask for said parts (the vendor won't even answer the phone, or often times even provide the emulator and tools) if you're not talking 500K+ quantities.

      --
      ---
    7. Re:Embedded platforms?!? by Waffle+Iron · · Score: 2, Informative
      How many embedded devices are running 64-bit processors now?

      I believe that the 64-bit capable MIPS architecture found its biggest success in the embedded processor market. From the wikipedia entry:

      In recent years most of technology used in the various MIPS generations has been offered as building-blocks for embedded processor designs. Both 32-bit and 64-bit basic cores are offered, known as the 4K and 5K respectively, and the design itself can be licenced as MIPS32 and MIPS64. These cores can be mixed with add-in units such as FPUs, SIMD systems, various input/output devices, etc.

      MIPS cores have been very successful, they form the basis of many newer Cisco routers, cable modems and ADSL modems, smartcards, laser printer engines, set-top boxes, handheld computers, and the Sony PlayStation 2.

    8. Re:Embedded platforms?!? by KnightStalker · · Score: 1

      Are video game consoles considered "embedded" devices? Seems to me they share many of the same characteristics. (Judging by a quick google search they are at least often described as embedded.) Several of those are 64-bit or more.... Jaguar, N64, PS2, Dreamcast, etc.

      --
      * And remember, it's spelled N-e-t-s-c-a-p-e, but it's pronounced "Mozilla."
    9. Re:Embedded platforms?!? by buttahead · · Score: 1

      maybe I'm retarded... probably, as a matter of fact. Why would a person want more address space than usable memory?

    10. Re:Embedded platforms?!? by Radius9 · · Score: 1

      That is basically register size. The pointers are still 32 bit. For example, the playstation 2 is considered a 128 bit machine. All the pointers are 32 bit, as is a regular floating point number, but I can process 4 floating point numbers in a single instruction. This is great for things like 3D games where you need to do a lot of linear algebra on vectors and matrices.

    11. Re:Embedded platforms?!? by bobthemonkey13 · · Score: 4, Informative
      Actually, it can go either way:
      • More address space than physical RAM: Swap space, memory-mapped files, shared memory/IPC, or any other use of virtual memory that doesn't map onto physical memory. This is why 64-bit address space is good even for desktop machines that have less than 4GB of RAM.
      • More physical RAM than address space: Ten processes, each using a single 4GB memory space, can consume 40GB of physical RAM. This is how and why you can put more than 4GB of memory in an x86 machine -- the processor maps from (I believe) 36-bit physical addresses to the 32-bit addresses that processes see.
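The two cases in the list above come down to comparing powers of two. Here is a rough C++ sketch, assuming 32-bit per-process virtual addresses and the 36-bit physical addressing the comment mentions for x86; `addressable` and `fits_in_physical` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kGiB = 1ULL << 30;

// Bytes reachable with an address of the given width (bits must be < 64).
constexpr std::uint64_t addressable(unsigned bits) {
    return 1ULL << bits;
}

// Can `processes` processes, each using `per_process_bytes`, all be backed
// by RAM at once on a machine with `physical_bits` of physical address?
constexpr bool fits_in_physical(std::uint64_t processes,
                                std::uint64_t per_process_bytes,
                                unsigned physical_bits) {
    return processes * per_process_bytes <= addressable(physical_bits);
}

// Usage sketch: each process sees 4 GiB, the machine can hold 64 GiB.
inline void self_check() {
    assert(addressable(32) == 4 * kGiB);
    assert(addressable(36) == 64 * kGiB);
    assert(fits_in_physical(10, addressable(32), 36));   // 40 GiB <= 64 GiB
    assert(!fits_in_physical(17, addressable(32), 36));  // 68 GiB >  64 GiB
}
```

The per-process limit and the machine-wide limit are independent, which is why both "more address space than RAM" and "more RAM than address space" can occur.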
    12. Re:Embedded platforms?!? by epine · · Score: 4, Informative

      I hate to confirm your self diagnosis, but I have sad news to bear.

      If you wish to use memory mapped IO to your file system, which has some good technical properties, you need a pointer with an address range *at least* as large as the largest possible file you might need to access, and preferably as large as the largest file system you intend to mount.

      Addressability and physical storage are somewhat orthogonal. (In theory, there is no difference between theory and practice, in practice there is.)

      On a machine with 10G of memory, there is no reason for a process to use 64-bit pointers if the process doesn't require more than 32 bits of addressability. If you look at Apache in the standard threading model, every request is managed by a different process. I doubt you need 64-bit pointers for *each* PHP instance, regardless of how much physical memory the machine contains.

      On the other hand, you might be doing some kind of video stream manipulation on a 10GB file using a machine with only 1GB of physical RAM. You would require the use of 64-bit addressability for this task if you choose the memory mapped IO model.

      So yes, you are retarded, but it could be cured by thinking before you type (the post does mention memory mapped IO). There: ten simple words of advice that should apply to 2^33 members of the slashdot community.
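The memory-mapped IO constraint described above can be written down as a one-line check. This C++ sketch gives an upper bound only: it ignores the part of the address space already taken by code, stack, heap and the kernel, so the real limit is lower. `can_map_whole_file` is a hypothetical helper, not an operating system call.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kGiB = 1ULL << 30;

// A file can only be mapped in one piece if every byte of it can be given
// its own address, i.e. the file is no larger than 2^pointer_bits bytes.
// Pointers of 64 bits or more can address any file size a uint64_t can hold.
constexpr bool can_map_whole_file(std::uint64_t file_bytes, unsigned pointer_bits) {
    return pointer_bits >= 64 || file_bytes <= (1ULL << pointer_bits);
}

// Usage sketch: the 10GB video file from the comment. Physical RAM does not
// appear anywhere in the check, only the width of the pointer.
inline void self_check() {
    assert(!can_map_whole_file(10 * kGiB, 32));
    assert(can_map_whole_file(10 * kGiB, 64));
    assert(can_map_whole_file(1 * kGiB, 32));
}
```

With 32-bit pointers the usual workaround is to map a sliding window over the file, which is extra bookkeeping that a 64-bit address space makes unnecessary.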

    13. Re:Embedded platforms?!? by buttahead · · Score: 1

      thanks... makes much more sense.... didn't think about memory mapping files.

    14. Re:Embedded platforms?!? by buttahead · · Score: 1

      retardedness is rarely noted so tactfully. gracias.

    15. Re:Embedded platforms?!? by Anonymous Coward · · Score: 0

      the playstation 2 is considered a 128 bit machine.

      Only by people who can't bear the thought that it's really a 32-bit machine because they're stuck in the early-90s "more bits == better games" rut.

      Repeat after me: the PS2 is a 32-bit machine. The PS2 is a 32-bit machine. This is not a negative judgement on its capabilities, it is merely a fact.

    16. Re:Embedded platforms?!? by AuMatar · · Score: 1

      I'm programming on a 32 bit embedded processor right now.

      Of course, we won't be going to a 64 bit chip in the near future, if ever.

      --
      I still have more fans than freaks. WTF is wrong with you people?
    17. Re:Embedded platforms?!? by cant_get_a_good_nick · · Score: 2, Interesting

      The HURD code to access disks uses mmap() calls, so is currently limited on 32 bit architecture to 2GB disks. Every partition has to be less than 2GB, which is a pain in the ass for today's >100GB drives.

    18. Re:Embedded platforms?!? by TwistedSquare · · Score: 1

      or accepted so gracefully..

    19. Re:Embedded platforms?!? by kscguru · · Score: 1
      In other news, the Playstation 2 also has a "mere" 32MB of RAM running at low bandwidth, so any pointer larger than 32 bits is completely and utterly useless. (It doesn't do virtual memory either - no swap device).

      That being said, yes, the PS2 has an absolutely beautiful processor setup. Inter-processor bandwidth galore and extremely custom caches (think DMA on steroids). The overhead of doing extra-bit calculations (e.g. SIMD instructions) disappears entirely behind the more optimized architecture.

      --

      A witty [sig] proves nothing. --Voltaire

    20. Re:Embedded platforms?!? by K-Man · · Score: 1

      it could be cured by thinking before you type


      Are you trying to put this site out of business?

      --
      ---- "If we have to go on with these damned quantum jumps, then I'm sorry that I ever got involved" - Erwin Schrodinger
    21. Re:Embedded platforms?!? by be-fan · · Score: 1

      Actually, the Playstation 2 has a lot of RAM bandwidth. 3.2GB/sec of memory bandwidth from a 32-bit RDRAM setup. That is quite a lot considering that the system was released more than four years ago.

      --
      A deep unwavering belief is a sure sign you're missing something...
    22. Re:Embedded platforms?!? by cybergrue · · Score: 1

      If you wish to use memory mapped IO to your file system, which has some good technical properties, you need a pointer with an address range *at least* as large as the largest possible file you might need to access, and preferably as large as the largest file system you intend to mount.
      Err, no. You can mmap a portion of the file up to the limit of the pointer, i.e. you can mmap a 1GB section of a 10GB file, and then keep moving the 1GB window until you have processed the entire file, using long longs for the offset in the file and 32-bit pointers for the offset inside the mmap. Watch out: sometimes the OS doesn't let you access the whole 2GB of memory allowed by the signed 32-bit pointer because the OS is using some of it for overhead. (1.8GB out of 2GB was available on the machine I had this problem on.)
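The sliding-window approach described above could look roughly like the following. This is only a sketch under stated assumptions: the function name `sum_file_windowed` and the `window` parameter are made up for illustration, and summing bytes stands in for whatever per-window processing is actually needed. The file offset is a 64-bit `off_t` while each mapping stays small enough for a 32-bit address space.

```c
#define _FILE_OFFSET_BITS 64   /* ask for a 64-bit off_t even on 32-bit builds */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sum every byte of a file by mapping it one window at a time.
 * `window` (hypothetical tuning knob) is rounded up to whole pages,
 * because mmap() offsets must be page-aligned.
 * Returns the sum, or -1 on error. */
int64_t sum_file_windowed(const char *path, size_t window)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    long page = sysconf(_SC_PAGESIZE);
    if (fstat(fd, &st) != 0 || page <= 0) {
        close(fd);
        return -1;
    }
    size_t pg = (size_t)page;
    if (window == 0)
        window = pg;
    window = (window + pg - 1) / pg * pg;

    int64_t sum = 0;
    for (off_t off = 0; off < st.st_size; off += (off_t)window) {
        off_t left = st.st_size - off;
        size_t len = left < (off_t)window ? (size_t)left : window;

        /* Only `len` bytes of address space are in use at any moment. */
        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, off);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }
        for (size_t i = 0; i < len; i++)
            sum += p[i];
        munmap(p, len);
    }
    close(fd);
    return sum;
}
```

The cost is one mmap()/munmap() pair per window, so a larger window means fewer system calls but more address space tied up at once.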

    23. Re:Embedded platforms?!? by kscguru · · Score: 1

      Ah, nuts, yeah. Substitute "low bandwidth" for "moderate latency" because of stupidity. :-) (Okay, so I was too busy drooling over processor-to-processor bandwidths to remember the main memory interface...)

      --

      A witty [sig] proves nothing. --Voltaire

    24. Re:Embedded platforms?!? by EelBait · · Score: 1

      Ever heard of "swap"?

      Unix has been providing more address space than usable memory for a long time.

    25. Re:Embedded platforms?!? by angel'o'sphere · · Score: 1


      maybe I'm retarded... probably, as a matter of fact. Why would a person want more address space than usable memory?

      I bet my pants your computer has more address space than physical memory.
      And if you tried to buy as much physical memory as you have address space, you would realize that:
      a) it won't fit into your computer
      b) you very likely cannot afford it
      angel'o'sphere

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
  4. Don't use 64-bit pointers on such systems. by Green+Light · · Score: 5, Insightful
    However, 64-bit pointers take up twice as much memory, which immediately affects memory footprint. This is especially an issue on embedded platforms where RAM is at a premium

    Huh? On systems where RAM is at a premium, I don't see the point of using or having 64-bit pointers.
    --
    "Send an Instant Karma to me" - Yes
    1. Re:Don't use 64-bit pointers on such systems. by Pseudonym · · Score: 4, Informative

      The poster named one point: mapping large files.

      Using mmap() for certain kinds of I/O is very, very useful in performance-sensitive applications. Using POSIX I/O (i.e. read(), write() and its relatives) means that your data must go through memory twice: once from disk into the buffer/page cache and then once again into userland. Memory-mapped I/O effectively unifies the two, saving on precious memory and memory bandwidth.
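To make the two I/O styles concrete, here is a minimal sketch that counts occurrences of a byte in a file both ways. The function names are invented for this example. With read() the kernel copies data from its page cache into `buf`; with mmap() the loop walks the mapped pages directly and there is no user-side buffer.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* read(): each chunk is copied from the kernel into buf before we see it. */
long count_byte_read(const char *path, unsigned char needle)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    unsigned char buf[65536];
    long hits = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        for (ssize_t i = 0; i < n; i++)
            hits += (buf[i] == needle);
    close(fd);
    return n < 0 ? -1 : hits;
}

/* mmap(): the file's pages appear in our address space; no second copy. */
long count_byte_mmap(const char *path, unsigned char needle)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects a zero length */
        close(fd);
        return 0;
    }
    size_t len = (size_t)st.st_size;
    unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping remains valid after close */
    if (p == MAP_FAILED)
        return -1;

    long hits = 0;
    for (size_t i = 0; i < len; i++)
        hits += (p[i] == needle);
    munmap(p, len);
    return hits;
}
```

Note that `count_byte_mmap` maps the whole file in one go, which is exactly the step that fails on a 32-bit process once the file outgrows the free address space.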

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    2. Re:Don't use 64-bit pointers on such systems. by Anonymous Coward · · Score: 0

      What sort of embedded platform has very limited RAM but handles files larger than 4 gigs?

      (I'm not trolling here - if there is such an application, I'd be interested to hear about it.)

    3. Re:Don't use 64-bit pointers on such systems. by ratboy666 · · Score: 1

      If a read() uses a page-aligned buffer, from a page-aligned source, then why wouldn't the OS map a page directly into the application space? (Assuming that the area had not been mmap'd shared). The same optimization can be made on write() calls.

      The same considerations need to be applied to mmap a file, so there should be no difference.

      In other words, read() and write() with page-alignment constraints should be the same as mmap. The difference is that re-using the same buffer may require an unmap.

      With mmap, an unmap doesn't have to match every i/o. Depending on the speed of virtual mapping vs. device i/o rate, this may be significant (or not).

      Ratboy

      --
      Just another "Cubible(sic) Joe" 2 17 3061
    4. Re:Don't use 64-bit pointers on such systems. by Anonymous Coward · · Score: 0
      you can have multiple file handles to the file. Assume you do remap pages so the kernel's read buffer is now moved to userland (vs copying memory). If another file handle wants to read that data, the kernel needs to go to disk again (since the first buffer may have been edited).

      That seems a lot more likely to happen than the buffer being page aligned and the read-size being a page-size or greater.

    5. Re:Don't use 64-bit pointers on such systems. by Nevyn · · Score: 1
      If a read() uses a page-aligned buffer, from a page-aligned source, then why wouldn't the OS map a page directly into the application space? (Assuming that the area had not been mmap'd shared). The same optimization can be made on write() calls.

      Because the app. doesn't share the data with the OS so if the app. alters the data the OS needs to have setup COW so the data it sees is the same. And it is very rare for applications to use page aligned buffers to read or write, it is also very common to change the buffer just after calling a read or write. This makes it a bad trade off to setup COW mappings in the general case, as it hurts all the normal apps. which are using sendfile() and/or mmap() to do this kind of zero copy operation.

      --
      ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B
    6. Re:Don't use 64-bit pointers on such systems. by milgr · · Score: 1

      Using mmap requires the system to set up quite a few VM data structures - which takes time and space. For example, to copy a 10GB file, it would be more efficient to alternate reads with writes than to mmap() the file, followed by a bunch of writes.

      On the other hand, to randomly access bits in a file over a long time, and across most of the file, mmap may be the way to go.
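For the sequential-copy case, alternating reads with writes might be sketched like this. The name `copy_file` and the 64 KB buffer size are arbitrary choices for illustration, not anything measured. The point is that one small buffer is reused for the whole file, so no large mapping (and none of its VM bookkeeping) is ever set up.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Copy src to dst by alternating read() and write() through one buffer.
 * Returns the number of bytes copied, or -1 on error. */
long long copy_file(const char *src, const char *dst)
{
    assert(src != NULL && dst != NULL);
    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    char buf[1 << 16];
    long long total = 0;
    ssize_t n;
    while ((n = read(in, buf, sizeof buf)) != 0) {
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted: retry the read */
            total = -1;
            break;
        }
        /* write() may accept fewer bytes than asked; loop until done. */
        for (ssize_t done = 0; done < n; ) {
            ssize_t w = write(out, buf + done, (size_t)(n - done));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                total = -1;
                break;
            }
            done += w;
        }
        if (total < 0)
            break;
        total += n;
    }
    close(in);
    if (close(out) != 0)
        total = -1;
    return total;
}
```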

      --
      Where law ends, tyranny begins -- William Pitt
  5. Trade-offs by El · · Score: 3, Interesting

    There are always potential trade-offs between run speed and memory space. For example, you could always use a single 64-bit pointer, and save all your addresses as 32-bit or even 16-bit offsets from that pointer (requiring pointer arithmetic to access any object). Then you would use less memory, but your code would run faster.

    --

    "Freedom means freedom for everybody" -- Dick Cheney

    1. Re:Trade-offs by spectral · · Score: 1

      Damn, less memory, less bandwidth required, and my code runs faster, hot damn! yes, I realize you meant to say slower ;)

    2. Re:Trade-offs by Anonymous Coward · · Score: 0
      Let's say you use a 64-bit base pointer and a 32/16-bit offset.

      1. load 64-bit base pointer into 64-bit register 1
      2. load 32-bit pointer into 64-bit register 2
      3. and register 2 with 0x00000000ffffffff
      4. add register 1 and 2

      Hmm... the increased code size more than makes up for the couple bytes save using smaller offsets.
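In C the base-plus-offset scheme discussed in this subthread can be sketched as a linked list whose links are 32-bit offsets into one arena. Everything here (`struct arena`, `node_at`, `list_push`, `list_sum`) is a hypothetical illustration, not code from any poster. On x86-64, for instance, a 32-bit load already zero-extends into the 64-bit register, so the explicit mask in step 3 above is not needed there; whether the extra add costs more than the smaller nodes save is something only a benchmark can settle.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* A node that links by 32-bit byte offset from the arena base instead of
 * by a full pointer. Offset 0 is reserved to mean "no next node". */
struct node {
    int32_t  value;
    uint32_t next;
};

struct arena {
    unsigned char *base;    /* the single full-width pointer */
    uint32_t used;
    uint32_t cap;
};

int arena_init(struct arena *a, uint32_t cap)
{
    assert(cap >= sizeof(struct node));
    a->base = malloc(cap);
    a->used = sizeof(struct node);  /* burn offset 0 so it can mean null */
    a->cap = cap;
    return a->base ? 0 : -1;
}

/* Widen the offset and add the base to recover a real pointer. */
static inline struct node *node_at(const struct arena *a, uint32_t off)
{
    return off ? (struct node *)(a->base + off) : NULL;
}

/* Push a value on the front of the list; returns the new head offset,
 * or 0 if the arena is full. */
uint32_t list_push(struct arena *a, uint32_t head, int32_t value)
{
    if (a->cap - a->used < sizeof(struct node))
        return 0;
    uint32_t off = a->used;
    a->used += sizeof(struct node);
    struct node *n = node_at(a, off);
    n->value = value;
    n->next = head;
    return off;
}

long list_sum(const struct arena *a, uint32_t head)
{
    long sum = 0;
    for (const struct node *n = node_at(a, head); n; n = node_at(a, n->next))
        sum += n->value;
    return sum;
}
```

Each node here is 8 bytes, where the same node with a native 64-bit `next` pointer would typically be 16 bytes after padding, so twice as many nodes fit in a cache line.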

    3. Re:Trade-offs by El · · Score: 1

      1) That depends on how many pointers you are iterating over.
      2) If the 32-bit register is also the low 32-bits of the 64-bit register, then it's just 2 loads, not 4 instructions... granted, that's twice as many loads as using 64-bit pointers, but you have a greater chance of all data fitting into cache, so it might actually be faster.

      --

      "Freedom means freedom for everybody" -- Dick Cheney

  6. Who cares? by IshanCaspian · · Score: 0, Flamebait

    Now, this kind of stuff might be useful for...um...hard-core video editing...and really, really huge servers, but that's about it. The truth of the matter is that your everyday user just has no need to handle numbers of that size or data of those quantities. There are very few situations where 32-bit processors are actually a problem.

    --

    But there is another kind of evil that we must fear most... and that is the indifference of good men.
    1. Re:Who cares? by Smidge204 · · Score: 3, Insightful

      The truth of the matter is that your everyday user just has no need to handle numbers of that size or data of those quantities.

      Now where have we heard that before...
      =Smidge=

    2. Re:Who cares? by AWWinter · · Score: 1

      And no one will ever need more than 640K of memory...

    3. Re:Who cares? by peter · · Score: 5, Informative

      It's not that 32bit processors are a problem, it's that their virtual address space is not very big. 64bit processors can mmap anything you want, even block devices >> a few terabytes. (So if the HURD ever gets ported to AMD64, they can support filesystems > 2GiB, which they don't last I checked because they mmap the device, and the HURD only runs on i386!)

      Being able to mmap anything you want is something you just plain can't do on a 32bit CPU. If you want to write programs that don't worry about address space limitations, you need 64bit. Anything that simplifies programming is good, since programmer time is valuable.

      Besides that, even if you have 1GB of RAM on i386, Linux needs highmem support to use it all. (It reserves 3GB of virtual address space for user space, and the kernel maps as much RAM as it can with the address space that's left over after mapping PCI and AGP space.) So 64bit is useful even on good desktop machines right now. (Using highmem slows the kernel down, so it might not even be worth it to map the last ~100MiB if you have 1GiB installed.)

      Stupid crap like highmem is exactly why we should be using 64bit CPUs.

      --
      #define X(x,y) x##y
      Peter Cordes ; e-mail: X(peter@cordes , .ca)
    4. Re:Who cares? by VojakSvejk · · Score: 0

      No one will ever need more than 640k of RAM.

    5. Re:Who cares? by Anonymous Coward · · Score: 0

      Being able to mmap anything you want is something you just plain can't do on a 32bit CPU. If you want to write programs that don't worry about address space limitations, you need 64bit. Anything that simplifies programming is good, since programmer time is valuable.

      I can't mmap anything "I" want to in 64bits. I'm guessing we need a good 1024bits to start mmaping those molecular computers that are just around the corner. Whenever you say this number is more than enough addressing to keep track of anything, you'll be wrong. Sooner or later you'll hit that limit. Of course you may be counting and tracking the number and position of atoms in our solar system, but the point is you'd need the addressing!

    6. Re:Who cares? by ratboy666 · · Score: 1

      It's address space, along with threading that kills you the quickest...

      Here's the thing.

      If you spin off threads, each thread gets a reserved chunk of address space for its stack. It shares code and data. The stack MUST be addressable by other threads, to allow proper thread semantics for data sharing.

      If (as is typical) 1MB is reserved for the thread stack, 1000 threads will take up 1GB of address space and 4000 threads fill memory address space in a 32 bit address space.

      So, you have a fancy web server, with SMP, using multi-threading to spin connections. Say 2 threads per connection (and a few more for overhead), and you find that you cannot run more than 1500 concurrent connections. 1500 connections is not a "really, really huge server". That would be low to mid-range in my book.
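The arithmetic in this comment is easy to check. The sketch below uses made-up helper names and ignores code, heap, and guard pages, so real limits are lower. With a full 4 GiB space and 1 MiB stacks it gives 4096 stacks, the "4000 threads" above; assuming a 3 GiB user-space split (an assumption, though it matches the Linux i386 default mentioned elsewhere in this thread) and two threads per connection it gives 1536 connections, close to the quoted 1500.

```c
#include <assert.h>
#include <stdint.h>

/* How many thread stacks of stack_bytes fit into space_bytes of virtual
 * address space, ignoring everything else the process must map. */
uint64_t max_thread_stacks(uint64_t space_bytes, uint64_t stack_bytes)
{
    assert(stack_bytes > 0);
    return space_bytes / stack_bytes;
}

/* Upper bound on connections when each one needs threads_per_conn threads. */
uint64_t max_connections(uint64_t space_bytes, uint64_t stack_bytes,
                         uint64_t threads_per_conn)
{
    assert(threads_per_conn > 0);
    return max_thread_stacks(space_bytes, stack_bytes) / threads_per_conn;
}
```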

      Ratboy

      --
      Just another "Cubible(sic) Joe" 2 17 3061
    7. Re:Who cares? by arkanes · · Score: 1

      If you've got 2 threads for every concurrent connection you've got some scaling issues beyond address space, though. Especially for a web server.

    8. Re:Who cares? by IshanCaspian · · Score: 1

      Yeah, like I said, it's good for servers, but where's the end-user application?

      --

      But there is another kind of evil that we must fear most... and that is the indifference of good men.
    9. Re:Who cares? by dfeist · · Score: 1

      It seems you do not really know what "exponential" means. I bet you can map everything that exists today in digital storage with 64 bits. 16 Exabytes.
      Even your "molecular computer" "just around the corner" (that means some decades I think) will probably not have more.

      I don't think 64 bits will be the last word, but 1024 bit is definitely too much. That's about 10^300 - the number of particles in the universe we know is about 100, so more than 256 bit should hardly ever be needed. I think there is even a long time to go until we'll see 128 bit computers.

      --
      Unix makes easy tasks hard and hard tasks possible. Windows makes easy tasks easy and hard tasks $29.95.
  7. Latency by andrewl6097 · · Score: 2, Interesting

    Given that memory access times are bound by latency far more than bandwidth, the effect of loading another four bytes into the register file is most likely insignificantly small. I'm certain that 8-byte register-to-register operations *are* insignificantly small, and it's likely that pointers, given that they are not large but often accessed would be kept in registers. It would depend highly on the particular architecture.

    1. Re:Latency by Zetta+Matrix · · Score: 3, Insightful

      Parent has interesting point, but it doesn't address the cache issue. 64-bit pointers will take twice as much space as 32-bit pointers. In a jump table situation, for instance, a 128-byte cache line (picking a reasonable number) could only hold 16 pointers instead of 32. Of course, as was also mentioned, when you have hardware that is designed to address more than 4 GB of memory, the amount of cache and main memory available is usually scaled up accordingly to deal with it. Bigger processor, bigger cache, more RAM, Moore's Law marches on.

      So, basically, don't worry about it... it's the price of progress. Unless you're running a 64-bit platform on a pitifully small amount of RAM.
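The jump-table figures above follow from simple division, sketched here with hypothetical helper names. The 128-byte line is the number picked in the comment, not a measurement of any particular CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CACHE_LINE_BYTES = 128 };    /* figure assumed in the comment above */

/* Number of table entries of the given width that share one cache line. */
size_t entries_per_line(size_t entry_bytes)
{
    assert(entry_bytes > 0 && entry_bytes <= (size_t)CACHE_LINE_BYTES);
    return (size_t)CACHE_LINE_BYTES / entry_bytes;
}

/* Cache lines touched by a dense table of n entries, rounded up. */
size_t lines_for_table(size_t n, size_t entry_bytes)
{
    size_t per_line = entries_per_line(entry_bytes);
    return (n + per_line - 1) / per_line;
}
```

For a 256-entry table this means 8 lines with 4-byte entries and 16 lines with 8-byte entries: the table's cache footprint doubles even though the entry count is unchanged.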

  8. 64 bit embedded processors? by Suppafly · · Score: 2, Insightful

    Does anyone use 64 bit processors for embedded applications?

    1. Re:64 bit embedded processors? by JPelzer · · Score: 1

      Does anyone use 64 bit processors for embedded applications?
      My iToaster does, but it mainly uses it as a heating element. This thing is approximately twice as good as my old-fashioned 32-bit toaster, except it uses twice as much toast as before.

    2. Re:64 bit embedded processors? by Pseudonym · · Score: 1

      When you think "embedded", you're probably thinking of a smartcard, a pacemaker, a digital television or a fax machine. It may interest you to know that an MRI scanner is also an embedded system.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    3. Re:64 bit embedded processors? by Anonymous Coward · · Score: 0

      http://www.google.com/search?q=r4000+embedded

    4. Re:64 bit embedded processors? by p3d0 · · Score: 1

      Yes.

      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
    5. Re:64 bit embedded processors? by Anonymous Coward · · Score: 0

      Depends. Are game consoles considered embedded systems?

  9. Implications of 64 bit pointers for interpreters by swdunlop · · Score: 4, Insightful

    There's an interesting discussion of 64-bit immediate values at the following link: 64 bit immediates in Python

    If we are already using 64 bits for our pointers, a virtual machine has the potential of exploiting the pointer's larger footprint for other immediate values. I'm not as crazy about using the MSB of the pointer for indicating an immediate as Ian Bicking appears to be, I'd recommend using the LSB since it's easier to bias any object to an even address than halve the potential addressable space.

    Then again, if the potential address space is 2 ** 64, I suppose it's not such a sacrifice.
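A low-bit tagging scheme of the kind recommended above might be sketched as follows. The type and function names (`value_t`, `box_int`, and so on) are invented for this example and are not taken from Python or any other VM. A set LSB marks an immediate integer held in the remaining bits; a clear LSB marks an ordinary pointer, which works as long as every object lives at an even address.

```c
#include <assert.h>
#include <stdint.h>

/* One machine word per VM value: LSB 1 = immediate integer,
 * LSB 0 = pointer to an object at an even address. */
typedef uintptr_t value_t;

static inline value_t box_int(intptr_t n)
{
    /* Shift in unsigned arithmetic; the top bit of n is sacrificed. */
    return ((uintptr_t)n << 1) | 1u;
}

static inline int is_int(value_t v)
{
    return (int)(v & 1u);
}

static inline intptr_t unbox_int(value_t v)
{
    assert(is_int(v));
    /* Relies on arithmetic right shift of negative values, which is
     * implementation-defined in C but what mainstream compilers do. */
    return (intptr_t)v >> 1;
}

static inline value_t box_ptr(void *p)
{
    assert(((uintptr_t)p & 1u) == 0);   /* objects must sit on even addresses */
    return (uintptr_t)p;
}

static inline void *unbox_ptr(value_t v)
{
    assert(!is_int(v));
    return (void *)v;
}
```

On a 64-bit build this leaves 63 bits for immediate integers, so small numbers never need a heap allocation at all.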

  10. Probably not as big a deal as you think. by Anonymous Coward · · Score: 5, Interesting

    With modern processors it's not uncommon to require 64-bit or 128-bit memory alignment on data structures to get the best performance. There are even some instructions that *require* such data alignments in order for them to work at all (for example: MMX or SIMD).

    Because of these existing data alignment issues, going from 32-bit to 64-bit pointers may have absolutely no impact on a program's memory usage and cache performance. It is highly likely you're already using 64-bit alignment when you enable the compiler's optimizations.

    Unless you're building massive linked lists of stuff in a scientific / simulation environment this is probably something not worth worrying about. The efficiency and volume of your actual data will still be the biggest waste of space - and it's not like you won't be able to attach more physical memory onto your new system than the old one.

    If it does affect you... you probably already know what you're doing or you've been making very bad assumptions about the size of your variable types.
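Whether alignment hides the cost of wider pointers depends on the struct layout, which can be inspected directly. The two structs below are hypothetical examples. On a typical LP64 target `struct mixed` is 16 bytes (4 bytes of padding before the pointer), versus 8 bytes with 4-byte pointers, so it does double. `struct with_double` is 16 bytes on LP64, and also 16 bytes on 32-bit ABIs that align double to 8; there the wider pointer simply uses up what was tail padding. Exact numbers are ABI-dependent, so treat them as assumptions and measure on the target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small field followed by a pointer: padding appears before `next`
 * whenever pointers need stricter alignment than int32_t. */
struct mixed {
    int32_t       tag;
    struct mixed *next;
};

/* A double already forces 8-byte alignment on many ABIs, so growing
 * the pointer from 4 to 8 bytes may not change sizeof at all. */
struct with_double {
    double              reading;
    struct with_double *next;
};

/* Bytes of padding the compiler inserted between tag and next. */
size_t mixed_inner_padding(void)
{
    return offsetof(struct mixed, next) - sizeof(int32_t);
}

/* Bytes of padding after the last member of struct with_double. */
size_t with_double_tail_padding(void)
{
    size_t end_of_next = offsetof(struct with_double, next)
                       + sizeof(struct with_double *);
    assert(sizeof(struct with_double) >= end_of_next);
    return sizeof(struct with_double) - end_of_next;
}
```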

    1. Re:Probably not as big a deal as you think. by andrewl6097 · · Score: 1

      I'm pretty sure that these days, at least on x86, data is aligned to 4-byte boundaries. Some architectures (I'm pretty sure that x86 is this way) require a 4-byte aligned address as the parameter to the 4-byte memory load instruction.

    2. Re:Probably not as big a deal as you think. by ratboy666 · · Score: 1

      No, x86 does not require data alignment for 4 byte memory load.

      --
      Just another "Cubible(sic) Joe" 2 17 3061
    3. Re:Probably not as big a deal as you think. by ratboy666 · · Score: 2, Informative

      Of course a 64 bit pointer is 2x the size of a 32 bit pointer... 32 bit pointers only need 4 byte alignment, and thus pack nicely. So 64 bit pointers will take twice the cache space.

      And... the pointers have to be loaded. It will take more address bits in the instructions to build constants. More cache used.

      It is NOT highly likely that 64-bit alignment is done when optimizing. In fact, that's just wrong.

      Yes, cache performance suffers.

      --
      Just another "Cubible(sic) Joe" 2 17 3061
    4. Re:Probably not as big a deal as you think. by MarkCollette · · Score: 3, Interesting

      Since no one who responded to you believes you, I thought I'd add in.

      Yes, x86 does not require alignment for the vast majority of data accesses, with pretty much the sole exceptions being SIMD instructions. And yes, that will run psychotically slower than aligning the data, which is why the compiler does it. Look into your MS VC++ optimization setting and see if it's using 4 byte or 8 byte alignment of structures by default. My goodness, it's 8 byte alignment, but why you ask? Because doubles need 8 byte alignment or else performance drops off a cliff. So don't discount alignment.

      As well, most code uses relative addressing with instructions, not absolute addressing, so don't expect all memory references to suddenly double, especially with stack caching of variables (which would be relative from the stack pointer).

      And finally, if 64 bitness makes caches be half full of zeros, look forward to chip manufacturers to include compression circuitry to alleviate that problem.

      - Mark Collette

    5. Re:Probably not as big a deal as you think. by LWATCDR · · Score: 1

      What about if you the STL collections? I would bet that there are more linked lists than you might imagine in things like that. This whole thing sort of reminds me of when we went from 16 to 32-bit cpus and code. Microsoft tried to convince people that the 16 bit code in Windows and even Windows9X was a feature because it was faster and took up less memory.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    6. Re:Probably not as big a deal as you think. by Anonymous Coward · · Score: 0

      What if you a verb in the sentence?

      (Suitable verbs for STL: Smash, crush, obliterate... ;-)

  11. First you gush about having over 4 GB of RAM by HotNeedleOfInquiry · · Score: 2, Insightful

    Then you whine about using an extra 4 bytes per pointer to address it. Seems to me that the number of pointers relative to the amount of RAM is so small it's not an issue. Correct me if I'm wrong.

    --
    "Eve of Destruction", it's not just for old hippies anymore...
    1. Re:First you gush about having over 4 GB of RAM by arkanes · · Score: 1
      I think this would actually impact higher level languages like Java and .NET a lot more than "normal" C and C++ programmers - heavily OO languages like those tend to create lots and lots of small references and probably have a higher pointer count.

      On the other hand, 64bit pointers make certain tradeoffs less desirable - for example, if you're passing around pointers to structs that are larger than 32 bits but smaller than 64, it's now more efficient to pass by value. That's a pretty borderline case, though.... :P

  12. Cache effects by Alomex · · Score: 1

    The biggest problem of using larger pointers is not so much the extra memory used (memory is cheap). The real problem is that you consume cache space much faster so you page at a much higher rate. This can slow down your program by a factor of up to 5x.

    1. Re:Cache effects by Pseudonym · · Score: 1

      As I mentioned previously, this can be more than offset by the cost savings you get in using memory-mapped I/O. Using standard POSIX I/O, your data hits memory twice.

      Oh, and 64-bit CPUs tend to have larger cache lines to cope.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    2. Re:Cache effects by Alomex · · Score: 1


      Forget about I/O I'm talking about moving code from RAM to Level 1 cache.

    3. Re:Cache effects by Pseudonym · · Score: 1

      ...and I'm saying that I/O can easily dominate cache. This is especially true when you consider that copying a few disk pages from one physical memory location to another could easily trash the contents of your L1 cache.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    4. Re:Cache effects by Alomex · · Score: 1

      Our app was heavily optimized to minimize I/O transfers, so that was not much of a concern once the code was loaded (initial load time was substantially slower, though). Cache effects were significant.

    5. Re:Cache effects by Pseudonym · · Score: 1

      Fair enough. I hack a certain high-performance database server for a living. I/O often dominates our applications, so we really care about memory-mapped I/O. As a result, we often find ourselves scrounging address space on "large" databases. Maybe our domain is more sensitive about it than yours is.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    6. Re:Cache effects by spongman · · Score: 1

      that's true, but you don't need to mmap() the whole file at once. you can easily emulate a paged version of read() by mmapping regions of the file you're interested and avoid running into any address-space limitations.

    7. Re:Cache effects by arkanes · · Score: 1

      And that's also true, but it comes at a performance and maintainability cost (the code's more complex and therefore more bug prone, you've got the overhead of maintaining the page file, etc, etc, etc). It's like saying you can emulate a 32bit address space with 16bit pointers, which you can, but it's hardly preferable to having a flat 32bit address space.

    8. Re:Cache effects by Ninja+Programmer · · Score: 1

      The cache consumption is no worse than doubled. AMD has doubled the on chip L2 cache in moving from Athlon to Athlon64/Opteron. 'Nuff said?

  13. sparc64 by keesh · · Score: 3, Informative

    With linux on sparc64, typical applications are 30% slower when running in 64bit userland mode as opposed to 32bit userland mode. There are of course exceptions...

    1. Re:sparc64 by Frequanaut · · Score: 3, Insightful

      wha??? Any linkage to back this up?

    2. Re:sparc64 by Iscariot_ · · Score: 1

      wha??? Any linkage to back this up?

      Doubtful, his name is keesh afterall... He probably forgot. Oh well.

  14. IA64 programming by Twister002 · · Score: 4, Informative

    Raymond Chen's web log. Lately he's been discussing IA 64 programming. I don't pretend to understand 1/2 of what he's talking about but I thought some of the readers here might be interested in what he has to say.

    --
    "For a successful technology, honesty must take precedence over public relations for nature cannot be fooled." -Feynman
  15. ...in their keyboard... by leonbrooks · · Score: 1

    ...this year, anyway. (-:

    --
    Got time? Spend some of it coding or testing
  16. It's baaack by fm6 · · Score: 1

    My God. The kludge that would not die! I thought we did away with memory models when we finally got rid of protected mode. But nooo. People still want to squeeze a few more bits out of their memory systems. Somebody call an exorcist!

    1. Re:It's baaack by Anonymous Coward · · Score: 0

      I think you will find that protected mode is what is currently used by modern operating systems on x86 architectures. It is not a kludge, but rather a way of 'protecting' processes from each other

  17. 2 comments by wayne606 · · Score: 2, Interesting

    First of all there is no such thing as a typical program... If you are writing a lisp interpreter where everything is a pointer then you may see your memory usage almost double. If you have a numerical program that is dominated by huge arrays of floats you might not see any difference at all.

    Second, here is a trick I have seen - it seems a bit strange but works well if you encapsulate your data well. Keep in mind that objects are generally aligned to an 8-byte boundary (if they are malloc'ed). That means your low 3 bits are not used at all. If your objects have, say, 64 bytes of data in them (possibly after a bit of padding) then you are wasting 6 bits. Just store your pointers as 32-bit words, shifted over by 6 bits. When you want to dereference them, your get-the-pointer accessor function just shifts them back and gives you a 64-bit pointer.

    Now you have an effective address space of 256GB and your data size has not grown at all. Maybe you have taken a hit in performance but until you benchmark you never know...
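The shift trick can be sketched as a pair of accessors. The names `ptr_pack` and `ptr_unpack` are made up, and the code works on plain 64-bit address values rather than real pointers, because it assumes something the comment leaves implicit: every object is 64-byte aligned and lives below 2^38 bytes (256 GB). If the allocator hands out higher addresses, a base address would have to be subtracted first.

```c
#include <assert.h>
#include <stdint.h>

enum { OBJ_SHIFT = 6 };     /* 64-byte objects: the low 6 address bits are zero */

/* Pack a 64-byte-aligned address below 2^38 into a 32-bit word. */
static inline uint32_t ptr_pack(uint64_t addr)
{
    assert((addr & ((UINT64_C(1) << OBJ_SHIFT) - 1)) == 0);  /* aligned? */
    assert((addr >> OBJ_SHIFT) <= UINT32_MAX);               /* within 256 GB? */
    return (uint32_t)(addr >> OBJ_SHIFT);
}

/* Expand a packed word back into the full 64-bit address. */
static inline uint64_t ptr_unpack(uint32_t packed)
{
    return (uint64_t)packed << OBJ_SHIFT;
}
```

The 256 GB figure falls out of the widths: 32 stored bits plus 6 implied zero bits give 38 address bits.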

    1. Re:2 comments by Anonymous Coward · · Score: 0

      You could just put them in a sparse array and let the compiler do the work.

  18. Video editing will become more widespread by tepples · · Score: 2, Insightful

    Now, this kind of stuff might be useful for...um...hard-core video editing...and really, really huge servers, but that's about it. The truth of the matter is that your everyday user just has no need to handle numbers of that size or data of those quantities.

    What happens when "your everyday user" wants to perform "hard-core video editing" on footage she shot of her family with her miniDV camcorder?

  19. The compact memory model by tepples · · Score: 1

    Usually, code in a given process won't fill more than 4 GB. In a jump table situation, instruction pointers can be 32-bit while data pointers are 64-bit, in a memory model resembling the "compact" memory model of old 16-bit Borland C++ compilers.

    Unless you're running a 64-bit platform on a pitifully small amount of RAM.

    Three words: PDA.

    1. Re:The compact memory model by BdosError · · Score: 1
      Unless you're running a 64-bit platform on a pitifully small amount of RAM.

      Three words: PDA.

      No-one:

      1. is currently using 64bit processors in PDAs
      2. would need 64bit pointers if they only had small memory (unless they were mapping a large storage device -- not something they do in PDAs).
      --
      Complexity is Easy. Simplicity is Hard.
    2. Re:The compact memory model by tepples · · Score: 1

      No-one ... would need 64bit pointers if they only had small memory

      Or unless they were trying to "Run the same apps that your desktop computer runs" as promised by the PDA's advertising. Oh wait; I had forgotten that an IBM-compatible PDA is called a "tablet" not a "PDA".

    3. Re:The compact memory model by arkanes · · Score: 1

      I've never seen a PDA that advertised this. Tablet PCs are a different market and don't have anything in common with PDAs. And even if you're going to call them a PDA, a "PDA" that's binary compatible with a desktop OS isn't a memory limited environment - they're the same hardware as laptops.

  20. not necessarily by Anonymous Coward · · Score: 0

    You will only trash the caches if you use (read: address) the entire 64bit space, not if you have 64 bit or 128 bit pointers. That should be obvious since caches work in contiguous chunks of memory, and so long as you stay in the same chunk, it doesn't matter, from the caching point of view, that you used 28 or 60 bits or whatever for its high-order address. The cache will have to use quite a bit more memory for storing the page addresses though, but that doesn't cause any trashing as it's done in the design stage.

    Bandwidth from memory to cache won't matter either because the address bus will just have to be wider.

    1. Re:not necessarily by aurum42 · · Score: 1

      I believe the post was referring to the effect of actually storing the pointer values in the d-cache, not the efficiency of the cache hardware. Instead of storing say x 32-bit pointer values in a cache line, you can now only store x/2 pointer values.

      --
      "The slave who knows his master's will and does not get ready...will be be beaten with many blows."Luke 12:47-48
  21. really !? by Anonymous Coward · · Score: 1, Interesting

    Care to document the above statement ? I believe it to be between very inaccurate to downright false, and here's why:

    Using more bits for addressing does not do anything to the cache's data memory: that memory works in CPU words and on a 64 bit system those are already 64 bit. Cache lookup tables, on the other hand (where it stores the high-order address bits for each cached block) will have to accommodate the additional bits, but that is done on the drawing board when designing the cache and it is a fixed amount dependent on the number of blocks in the data part of the cache memory.

    Also, paging has nothing to do with caching. Paging is a memory virtualization mechanism the CPU uses. Caching works in blocks of data that are almost never the size of a CPU page, they are "lines" of 16, 64 or whatever bytes.

    1. Re:really !? by Alomex · · Score: 1

      Using more bits for addressing does not do anything to the cache's data memory: that memory works in CPU words and on a 64 bit system those are already 64 bit.

      Sure. The comparison is between 32 bit and 64 bit words, including but not necessarily limited to addressing.

      paging has nothing to do with caching.

      Have you ever heard the term "paging to disk"? Do you know what it means?

    2. Re:really !? by Anonymous Coward · · Score: 0

      Hmm. The article specifically states the question is about using 64-bit over 32-bit pointers in 64-bit processors. If it were talking about a move from 32 bit to 64 bit in general, your argument would work better.

      Paging to disk is swapping. If your machine is swapping heavy enough to affect cache you already have worse problems to deal with and most probably ran out of memory. Swapping does not happen all that often on a normal machine, and it does not happen at all unless a certain allocated memory threshold is reached.

  22. Alpha by Tune · · Score: 2, Insightful


    I concur with your findings. Back in the days when I was experiencing a little discomfort with the speed of my Pentium 90 running Linux, I decided to buy a Digital Alpha system at 266 MHz. Both systems were configured with 64 MB, and both ran Red Hat 5.2.

    Although the Alpha system is obviously superior in number crunching, I noticed it ran out of physical memory on a regular basis where my P90 would still be happy. Part of the matter is that Alpha binaries tended to be much larger, as was the kernel. But I'm also quite sure that a major part is primarily due to the increased amount of "lost" bits in pointers and memory alignments of small data structures.

    --
    The problem with engineers is that they tend to cheat in order to get results. The problem with mathematicians is that they tend to work on toy problems in order to get results. The problem with program verifiers is that they tend to cheat at toy problems in order to get results.

    1. Re:Alpha by Anonymous Coward · · Score: 0

      Back when Microsoft was selling boxed software for the Alpha (which I'll admit was a loooong time ago), the memory requirements for said software were usually 50-100% higher than those for the x86 version. Whether that was due to them shipping unoptimized debug builds or the 64-bitness of the platform I can't say. Probably equal amounts of each ;)

    2. Re:Alpha by LordHunter317 · · Score: 1

      No, this was because Linux/Alpha at that point didn't have an ld.so. Everything was statically compiled, which is where the size increase came from.

  23. I use 64-bits as a timestamped pointer. by torpor · · Score: 1

    First 32-bit mantissa is a timestamp, second is the pointer.

    It's nice to have a pointer with time. This makes for some interesting algorithms...
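    The post gives no details, so the following is only a guess at the layout: a C++ sketch that assumes the timestamp sits in the upper 32 bits and a 32-bit address in the lower 32 bits. The names `StampedPtr`, `make_stamped`, `timestamp_of` and `pointer_of` are invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: high 32 bits = timestamp, low 32 bits = 32-bit pointer value.
using StampedPtr = std::uint64_t;

// Pack a timestamp and a 32-bit address into one 64-bit word.
constexpr StampedPtr make_stamped(std::uint32_t timestamp, std::uint32_t ptr32) {
    return (static_cast<std::uint64_t>(timestamp) << 32) | ptr32;
}

// Extract the upper half (timestamp).
constexpr std::uint32_t timestamp_of(StampedPtr s) {
    return static_cast<std::uint32_t>(s >> 32);
}

// Extract the lower half (pointer value).
constexpr std::uint32_t pointer_of(StampedPtr s) {
    return static_cast<std::uint32_t>(s & 0xFFFFFFFFu);
}
```

    Because both halves live in one machine word, the pair can be read or swapped in a single 64-bit operation, which is presumably where the interesting algorithms come from.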

    --
    ; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
    1. Re:I use 64-bits as a timestamped pointer. by Anonymous Coward · · Score: 0

      Or you can use checksum + pointer!

    2. Re:I use 64-bits as a timestamped pointer. by turgid · · Score: 1

      You are a liar, and a lunatic, but the kind of lunatic I like.

    3. Re:I use 64-bits as a timestamped pointer. by torpor · · Score: 1

      i'm not a liar, it's true, i do use them!

      and well, it is true that i am a lunatic however ...

      --
      ; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
    4. Re:I use 64-bits as a timestamped pointer. by turgid · · Score: 1

      Then you truly are a lunatic.

  24. And segments...? by StarBar · · Score: 2, Informative

    On CPUs with segments the impact must be much less, if there is any at all. Say, for instance, that you reside in a 32 bit segment X and 16 bit subsegment Y; then you would use 16 bit storage of pointers in RAM, even though the CPU constructs the full 64 bit pointer internally by concatenating all the parts from the segment registers with the 16 bits from RAM.

    I don't assume any CPU in particular just the principle of segments.
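    A minimal C++ sketch of that principle, not tied to any real CPU. `SegmentRegs`, `NearPtr` and `full_address` are hypothetical names, and the bit positions are simply one way to concatenate a 32 bit segment, a 16 bit subsegment and a 16 bit stored offset.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical segment state held in CPU registers, not in RAM.
struct SegmentRegs {
    std::uint32_t segment;     // X: assumed to supply bits 63..32
    std::uint16_t subsegment;  // Y: assumed to supply bits 31..16
};

// Only this 16-bit offset is stored in RAM for each pointer.
using NearPtr = std::uint16_t;

// Concatenate segment, subsegment and stored offset into a 64-bit address.
constexpr std::uint64_t full_address(SegmentRegs regs, NearPtr offset) {
    return (static_cast<std::uint64_t>(regs.segment) << 32) |
           (static_cast<std::uint64_t>(regs.subsegment) << 16) |
           static_cast<std::uint64_t>(offset);
}
```

    Each stored pointer costs 2 bytes instead of 8, at the price of only being able to point inside the current subsegment.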

  25. Answer: yes by p3d0 · · Score: 2, Interesting
    A while back, I was looking into more efficient heap storage of Java objects, and found that the heap of a variety of Java programs consists of about half pointers and half ints. The next most common type was booleans, and they were under 1%. Everything else was vanishingly small.

    Thus, you can expect Java heaps to expand by about 50% when moving from 32-bit to 64-bit pointers. What effect this has on your program's performance depends on the relation between the program's resident sets and the machine's cache. For instance, if your program has a resident set of 200KB on a machine with a 256KB cache, then the extra 50% will blow the cache and kill your performance. If the resident set were 150KB, the performance impact would probably be minimal.
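    The arithmetic behind those figures, as a rough C++ sketch. It assumes, as above, that half the heap bytes are pointers and that only the pointers grow; the function names are made up for the example.

```cpp
#include <cassert>

// Growth factor of a heap when pointers double in size.
// pointer_fraction: share of heap bytes that are pointers (0.5 in the post).
constexpr double heap_growth(double pointer_fraction) {
    return pointer_fraction * 2.0 + (1.0 - pointer_fraction);
}

// Resident set size in KB after widening the pointers.
constexpr double widened_kb(double resident_kb, double pointer_fraction) {
    return resident_kb * heap_growth(pointer_fraction);
}

// Whether the widened resident set still fits in a cache of cache_kb.
constexpr bool fits_in_cache(double resident_kb, double pointer_fraction,
                             double cache_kb) {
    return widened_kb(resident_kb, pointer_fraction) <= cache_kb;
}
```

    With a 256KB cache, `widened_kb(200, 0.5)` gives 300KB, which no longer fits, while `widened_kb(150, 0.5)` gives 225KB, which still does.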

    Disclaimer: I was doing this as a pet project in my spare time, so take these numbers with a grain of salt.

    --
    Patrick Doyle
    I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  26. 64 bit pointers on embedded platforms? by polyp2000 · · Score: 2, Interesting

    >This is especially an issue on embedded platforms where RAM is at a premium...

    What kinds of embedded platforms are likely to need more than 4GB of RAM anyhow? I sure as hell can't imagine a use for a 64-bit washing machine with upwards of 4 gigs... That's a hell of a lot of washing programmes.

    --
    Electronic Music Made Using Linux http://soundcloud.com/polyp
    1. Re:64 bit pointers on embedded platforms? by Anonymous Coward · · Score: 0

      Within a year or two someone will be able to hack Linux into an embedded processor on a modified former mp3 player with greater than 4 GB of memory. Setting it all up as slow RAM instead of disk is feasible.

      And someone will come up with a use for that also.

  27. Dopplers, Too by mosel-saar-ruwer · · Score: 0

    There's a lot of modern medical equipment which can definitely use the 4GB. MRI machines, CT scanners, ultrasound machines ("sonographs" if you prefer the term) and so on do tend to chew up memory. Particularly the first two, because you often need to hold whole voxel sets in memory while you compute a bunch of cross-sections at odd angles.

    We're about to up our Doppler sampling rate to

    (3 channels) X (24 bit samples stored as 8 byte doubles) X (125 K Samples/second) = 3MB/sec
    Since it takes the technician a good ten minutes or more to find the signal, we're looking at
    10 X 60 X 3MB = 1.8GB
    without batting an eyelash.

    Granted, that's not in the same league as the three dimensional stuff, but it ain't exactly peanuts, either.

    PS: I haven't done the math yet, but if 8 byte doubles don't give us sufficient granularity to store 24-bit samples, we may need to up our storage to 12 byte [96 bit] or even 16 byte [128 bit] doubles.

    Where's that IEEE standard when you need it?
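    For what it's worth, the numbers above can be checked with a few lines of C++. This sketch assumes the samples are integer values and that `double` is an IEEE 754 binary64 (53-bit significand), which is the case on mainstream platforms; the constant and function names are invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Figures taken from the post above.
constexpr std::uint64_t kChannels = 3;
constexpr std::uint64_t kBytesPerSample = 8;       // one 8-byte double per sample
constexpr std::uint64_t kSamplesPerSecond = 125000;
constexpr std::uint64_t kSeconds = 10 * 60;        // ten minutes

// Raw data rate in bytes per second.
constexpr std::uint64_t bytes_per_second() {
    return kChannels * kBytesPerSample * kSamplesPerSecond;
}

// Total bytes buffered over the ten-minute session.
constexpr std::uint64_t bytes_per_session() {
    return bytes_per_second() * kSeconds;
}

// A binary64 double has a 53-bit significand, so any 24-bit integer
// sample value is represented exactly.
constexpr bool double_holds_24_bit_samples() {
    return std::numeric_limits<double>::digits >= 24;
}
```

    Under those assumptions the rate comes out at 3,000,000 bytes/sec and the session at 1,800,000,000 bytes, and an 8 byte double already has more than twice the significand bits a 24-bit integer sample needs.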

  28. Who will ever need more than 4 GB of memory??? by Anonymous Coward · · Score: 0

    Isn't it unfair that anyone can say something like "what kind of embedded systems need more than 4GB online simultaneously" off the top of their head, but when Bill Gates says something like "640K is more than you'll ever need", it's taken so seriously and used to make him look bad?

    Yes, today 4 GB is a hell of a lot, but 20 years from now, I'll bet that everything will have gigs and gigs of memory, and then your statement will look as stupid as Bill Gates' did back in the 80s.

  29. Seriously OT by now... by ratboy666 · · Score: 1

    And mmap() can have multiple mappings of the file -- how do you think that's handled? It's the same thing. Why should the kernel have to go to disk again? The map consistency has to be there already, to handle mmap(). COW if it's private, otherwise, share. If the read()/write() buffer is not aligned, you do need to copy the data -- as if you are a user of mmap(). Big deal, the optimization is "lost".

    Still, the easiest way to handle this is to always mmap() files, and read/write will either (a) be replaced by mmap(), or (b) do a copy. If the kernel lives in 64 bit land, the application can still live in 32 bit land for better cache handling, and the optimization works. If the kernel is 32 bit, mmap() files always doesn't work, and there are problems... (HURD), or some semi-fancy filesystem code.

    As to WHY? The same code can run at near-optimal speed (mmap-ish), and STILL use read/write for portability to other environments. If I write mmap() based code, I have to worry about alignment, AND have to worry about porting to read/write. If I use read/write, and give page-alignment, the OS can optimize if it is able.

    Do YOU want to do the lifting (porting) or leave it to someone else?

    Ratboy

    --
    Just another "Cubible(sic) Joe" 2 17 3061
  30. Minor increase in memory use by isj · · Score: 2, Informative

    A few years back I did a test with a server which stores state information (I will not bore you with the details). I did some performance tests on both the 32-bit version and the 64-bit version. Same source code. Same test data. Same configuration. On HP-UX 11.0 PA-RISC with the aCC compiler.
    The 64-bit version used about 15% more memory than the 32-bit version. But it was also 20% faster. That still puzzles me, because the server does not perform any 64-bit operations.

    1. Re:Minor increase in memory use by mean+pun · · Score: 1
      The 64-bit version used about 15% more memory than the 32-bit version. But it was also 20% faster. That still puzzles me, because the server does not perform any 64-bit operations.

      One possibility is that the 64-bit version had better instruction-cache behaviour, because fewer `hot' code segments were fighting for the same cache lines.

      Or perhaps the equivalent in the data cache.

    2. Re:Minor increase in memory use by isj · · Score: 1

      True. I did not try to run it through a feedback profiler and then rearranging the code to keep the hot code together. That could have been interesting.
      I have another theory: the compiler could assume 64-bit operations were safe and did them behind my back for copying data. Or maybe the heap manager is vastly more efficient in 64-bit mode than in 32-bit.
      I remember the Watcom compiler doing similar tricks in 16-bit mode with inlining structure-copying by setting up SI and DI registers and doing 32-bit moves if it knew it was safe.

  31. In Java, you use pointers all the time by Anonymous Coward · · Score: 1, Informative

    They usually call them "references", but names like NullPointerException give the game away...

    Yes, it's true that Java's pointers don't behave quite the same way as C's pointers, but then they don't behave like C++'s references either. It's a different language.

    If you meant that you don't do pointer arithmetic, you'd be right, there's none of that available to users in Java, and that's mostly a good thing.

    Shame your Java apps use 16-bit characters, when Unicode needs more... maybe you could switch to a language better suited to modern tasks such as I18N. C or C++ might handle that need just fine, with wchar_t typically being 32-bits.

  32. 64-bits does not imply "memory to spare" by jdennett · · Score: 1

    If you've moved to a 64-bit platform, it's often because you need access to every bit of memory and performance you can get. Increasing cache misses and filling your memory by having larger datasets because of longer wordsizes isn't a trade-off to be taken without careful consideration.

    On the other hand, most of the systems I know that use >= 4GB of RAM are databases, and pointers make up a tiny fraction of the memory footprint.

  33. Why not use 16 bit code, then? by gillbates · · Score: 4, Informative

    Seriously, it is faster. I've been writing in assembly for years, and unless I need a 32 bit pointer, I generally don't use them.

    If you're that concerned about performance that you are analysing pointer size, you might as well code in assembly. Yes, 64 bit pointers have a bigger footprint, but we experienced the same problem when we went to unicode strings, 32 bit code, etc...

    My advice is this: let the compiler deal with it. Unless you are willing to crank out a lot of hand-coded assembly or are interfacing with hardware, the 32/64 bit pointer question is pretty much moot. As it is, you can't control:

    • Where your linker places segments in the loaded image. Trust me, this is a big source of cache misses on the older processors where the libraries were in one area of memory and the running code in another.
    • The optimal ordering of instructions to keep the U and V pipelines of the processor filled. Some of the modern compilers can do this pretty well, but you can never be too sure. The number of clock cycles an instruction takes can vary by a factor of 3, so unless you're willing to learn some pretty hardcore assembly, you're stuck with whatever the compiler gives you.
    • The instruction level optimization of the compiler. Intel's new C++ compiler will turn the familiar array initialization code:

      for (int x = 0; x < 256; x++)buffer[x] = 0;

      Into something like this:

      mov cx,64        ; 256 bytes / 4 bytes per dword
      mov eax,0        ; value to store
      mov di,buffer    ; stosd writes to ES:DI, not SI
      cld              ; store forwards
      rep stosd


      Instead of the literal translations of the old compilers:

      mov si,buffer
      mov bx,0 ; this is the x variable
      forlabel@10001:
      mov byte ptr [bx + si],0 ; operand size must be stated for an immediate store
      mov ax,1
      add ax,bx
      xchg bx,ax
      cmp bx,256
      jl forlabel@10001


      The former takes 68 instruction cycles, the latter takes (6 * 256 + 2) = 1538!

    The aforementioned issues have a much bigger impact on performance than pointer size. Given that the memory bus is at least 64 bits wide on anything newer than a pentium, you won't incur a clock cycle penalty for using 64 bit pointers.

    The only thing that I would suggest is to watch where you place pointers in structures. For example, when building a linked list, you would want to do something like this:
    class link {
        link * ptrforward;
        link * ptrbackward;
        link * ptrdata;
    };
    rather than:
    class link {
        link * ptrdata;
        link * ptrbackward;
        link * ptrforward;
    };
    Because the processor pulls 64 bits per address accessed, the former structure would have the forward pointer in cache regardless of the pointer size. With the second structure, traversing a list in the forward direction would result in a cache miss on every node visited, regardless of pointer size (This applies only to the x86...).
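    To see where the fields actually land, here is a small C++ sketch. It only reports byte offsets; it does not measure cache behaviour, and whether the offset matters in practice depends on the CPU's cache line size. The struct names are made up, and `ptrdata` is declared as `void*` here purely for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Layout from the post with the forward pointer first.
struct LinkForwardFirst {
    LinkForwardFirst* ptrforward;
    LinkForwardFirst* ptrbackward;
    void* ptrdata;  // payload type is an assumption
};

// Layout with the data pointer first and the forward pointer last.
struct LinkDataFirst {
    void* ptrdata;  // payload type is an assumption
    LinkDataFirst* ptrbackward;
    LinkDataFirst* ptrforward;
};

// Byte offset of the forward pointer in each layout.
constexpr std::size_t kForwardOffsetA = offsetof(LinkForwardFirst, ptrforward);
constexpr std::size_t kForwardOffsetB = offsetof(LinkDataFirst, ptrforward);
```

    On a typical 32-bit build the forward pointer sits at offset 0 in the first layout and 8 in the second; on a typical 64-bit build those become 0 and 16.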

    My experience has been that pointer size is only relevant on truly tiny systems - for example, 16 bit code which has to fit into a few kilobytes. Usually, as programs scale to work with larger datasets, the percentage of memory used for pointers decreases rapidly. You'll find that as data sizes increase, the practical uses for linked structures shrink; locating an element by using a binary search on a sorted array scales much better than a linear search traversing a linked list.

    --
    The society for a thought-free internet welcomes you.
    1. Re:Why not use 16 bit code, then? by happyfrogcow · · Score: 1


      class link {
          link * ptrforward;
          link * ptrbackward;
          link * ptrdata;
      };
      rather than:
      class link {
          link * ptrdata;
          link * ptrbackward;
          link * ptrforward;
      };
      Because the processor pulls 64 bits per address accessed, the former structure would have the forward pointer in cache regardless of the pointer size. With the second structure, traversing a list in the forward direction would result in a cache miss on every node visited, regardless of pointer size (This applies only to the x86...).


      I have just a question really. You're saying the first is better (fewer cache misses), based on the assumption that you will be using ptrforward more often than any of the other pointers?

    2. Re:Why not use 16 bit code, then? by gillbates · · Score: 1
      That is generally the assumption. Usually, the most time-consuming algorithms process lists from front to back - i.e., searching, sorting, popping from a stack, etc...

      Past the Pentium 2, the bus width went to 64 bits, so in this case, both pointers would be in cache if you were using a 32 bit system. If you're building for AMD's Opteron, you won't experience a performance hit when going 64 bit, because the Opteron's bus is 144 bits. However, the Itanium's bus is only 64 bits, so you might experience a performance decrease, assuming that you needed to access the backward pointer. Either way, it won't be that much of a difference, because as I stated previously, pointers are generally the least useful in time-sensitive or constrained memory algorithms. If you've got a program which needs more than 4GB of memory, you probably won't be using linked lists or other pointer-intensive structures simply due to the fact that these data structures aren't efficient enough to process this volume of data in a reasonable amount of time.

      --
      The society for a thought-free internet welcomes you.
    3. Re:Why not use 16 bit code, then? by Anonymous Coward · · Score: 0

      U and V pipes? Are you still writing code for the original Pentium?

      Just let the compiler order instructions, since the ordering rules are different for different CPU generations. Far easier to just recompile for a new CPU.

  34. And we thought 64 bits would solve the 2038 issue by Anonymous Coward · · Score: 0

    But with this approach we can have 64-bit machines and still have problems when a 32-bit time_t wraps around.
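    For reference, the wrap point of a signed 32-bit time_t can be worked out in a few lines of C++. The constant names are invented for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Largest value a signed 32-bit time_t can hold (seconds since 1970-01-01 UTC).
constexpr std::int64_t kMaxTime32 = std::numeric_limits<std::int32_t>::max();

constexpr std::int64_t kSecondsPerDay = 86400;

// Whole days and leftover seconds between the epoch and the last
// representable second.
constexpr std::int64_t kDaysUntilWrap = kMaxTime32 / kSecondsPerDay;
constexpr std::int64_t kSecondsIntoLastDay = kMaxTime32 % kSecondsPerDay;
```

    That comes to 24855 days plus 11647 seconds (03:14:07), which is the familiar 19 January 2038 limit, and it applies wherever the 32-bit field survives, regardless of how wide the machine's registers are.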

  35. Re:Implications of 64 bit pointers for interpreter by Ninja+Programmer · · Score: 1
    I'm not as crazy about using the MSB of the pointer for indicating an immediate as Ian Bicking appears to be, I'd recommend using the LSB since it's easier to bias any object to an even address than halve the potential addressable space.
    AMD (you know, the guys who made x86-64) are NOT fans of these kinds of ideas. If you scribble in undefined places in the pointer, the Opteron/Athlon64 will throw an exception. Pointers in x86-64 are sign-extended, so it's not trivial to hide stuff in upper bits and then fix the pointer back.
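    A rough C++ sketch of the LSB approach, for illustration only. It assumes every object lives at an even address and that the compiler uses an arithmetic right shift for signed values (true of mainstream compilers, guaranteed from C++20). `Word`, `box_int`, `box_ptr` and friends are hypothetical names, not taken from any real interpreter.

```cpp
#include <cassert>
#include <cstdint>

// A tagged word: either an object pointer at an even address (LSB = 0)
// or a small integer shifted left by one with the LSB set to 1.
using Word = std::uintptr_t;

// Store an integer as an immediate; the top bit of the value is lost.
inline Word box_int(std::intptr_t value) {
    return (static_cast<Word>(value) << 1) | 1u;
}

// Store a pointer unchanged; it must already have a clear LSB.
inline Word box_ptr(void* p) {
    Word w = reinterpret_cast<Word>(p);
    assert((w & 1u) == 0);  // objects must sit at even addresses
    return w;
}

inline bool is_immediate(Word w) { return (w & 1u) != 0; }

// Arithmetic right shift drops the tag bit and restores the sign.
inline std::intptr_t unbox_int(Word w) {
    return static_cast<std::intptr_t>(w) >> 1;
}

inline void* unbox_ptr(Word w) { return reinterpret_cast<void*>(w); }
```

    Pointers pass through untouched, so nothing is hidden in the upper bits and the hardware's address checks never see a modified value.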
  36. Quick Quiz by Cmdr+TECO · · Score: 1

    "Most 64-bit processors provide a 32-bit mode for compatibility"

    One free mod point for the first correct answer: Name a 64-bit processor with a 60-bit mode for compatibility.

    --
    echo 33676832766569823265328479713269.8639857989Pq | dc
    1. Re:Quick Quiz by tbakker · · Score: 2, Informative

      Cyber 180

  37. Re:Implications of 64 bit pointers for interpreter by be-fan · · Score: 1

    That blows. There is so much cool stuff you can do with 64-bit pointers, because nobody really needs more than about 45 of those bits.

    --
    A deep unwavering belief is a sure sign you're missing something...
  38. Mo' memory, mo' memory, mo' memory! by Anonymous Coward · · Score: 0

    With 64-bit processors, I'll have enough memory to rule the world! Mwah ha ha ha!!

  39. 64-bit versus 32-bit metrics by tokki · · Score: 1
    These tests don't use applications written specifically for 64-bit (I know of no open source application written with 64-bit in mind), and the testing is limited (all benchmarks are inherently limited), but they are a start, and they represent more than the pure conjecture that seems to be pervasive. Of course, these are tests on Solaris 9 for SPARC, so other platforms will likely vary.

    64-bit versus 32-bit

  40. Name it? by AuraSeer · · Score: 1

    I think I'll call it "Bob."

  41. Dumb question by EelBait · · Score: 1

    This is a dumb question. Do you really think that on a system with more than 4GB of memory that memory would be at such a premium that an additional four bytes per pointer would even be noticeable? Surely you jest!

    1. Re:Dumb question by localhost00 · · Score: 1

      I don't think 4 bytes will be all that noticeable. And don't call me Shirley.

      --

      Calling atheism and agnosticism a religion is like calling bald a hair color.

    2. Re:Dumb question by EelBait · · Score: 1

      So I slap you and call you what? Susan?

  42. Re:Implications of 64 bit pointers for interpreter by Anonymous Coward · · Score: 0

    20 years from now you'll be happy AMD prevented billions of lines of code that relied on such hacks.