Effect of Using 64-bit Pointers?

Posted by Cliff on Wednesday January 21, 2004 @01:30PM from the technical-changes-and-internal-effects dept.

An anonymous reader queries: "Most 64-bit processors provide a 32-bit mode for compatibility, but 64-bit pointers are becoming essential as systems move beyond 4GB of RAM. Also, the large virtual address space is very useful for several reasons - allowing large files to be memory-mapped, and allowing pages of memory to be remapped without ever requiring the virtual address space to be defragmented. However, 64-bit pointers take up twice as much memory, which immediately affects memory footprint. This is especially an issue on embedded platforms where RAM is at a premium, but even on systems where RAM is plentiful and cheap the extra memory footprint reduces cache performance. Have Slashdot readers done any research into the actual effect of using 64-bit pointers in a 'typical' application? What proportion of a real program's data is actually pointers?"

Embedded platforms?!? by El · 2004-01-21 13:38 · Score: 3, Interesting

How many embedded devices are running 64-bit processors now? Offhand, I'd say this is only a problem if you have an embedded device with more than 4 GBytes of memory... in other words, it hardly sounds like a real-world problem for embedded devices. Yes, workstations and servers with 64-bit processors should probably be using 64-bit pointers.

--
"Freedom means freedom for everybody" -- Dick Cheney
1. Re:Embedded platforms?!? by davetm · 2004-01-21 14:26 · Score: 2, Interesting
  
  Personally, both types of embedded device I've worked on have been 32 bit. The first was a database engine (think network attached storage^H^H^H^H^H^H^Hdatabase of several Terabyte dataset size), and the second is set top boxes for digital tv. In the case of the first I can immediately see the need for 64bit arithmetic AND addressing. In the case of the set top box I think 32 bits will be fine for a while yet; there is pressure for faster processors, but not for 64 bit arithmetic.
  
  --
  -- Dave
2. Re:Embedded platforms?!? by addaon · 2004-01-21 14:48 · Score: 3, Interesting
  
  Many 32-bit platforms, including x86 and PowerPC, support 64GB of RAM but only 4GB of address space. Most people want more than 4GB of address space, but don't yet care about more than 4GB of RAM.
  
  --
  
  I've had this sig for three days.
3. Re:Embedded platforms?!? by cant_get_a_good_nick · 2004-01-22 09:04 · Score: 2, Interesting
  
  The HURD code to access disks uses mmap() calls, so on 32-bit architectures it is currently limited to 2GB disks. Every partition has to be less than 2GB, which is a pain in the ass for today's >100GB drives.
Trade-offs by El · 2004-01-21 13:51 · Score: 3, Interesting

There are always potential trade-offs between run speed and memory space. For example, you could always use a single 64-bit pointer, and save all your addresses as 32-bit or even 16-bit offsets from that pointer (requiring pointer arithmetic to access any object). Then you would use less memory, but your code would run slower.
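
A minimal C sketch of the base-plus-offset idea, for illustration only. The names `arena`, `ref32`, `arena_alloc` and `deref` are made up for this example, and the bump allocator is just an assumption to keep it self-contained. Every dereference pays for one extra addition, which is the speed side of the trade-off.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scheme: one full-width base pointer, and every other
   reference is a 32-bit byte offset from it. Offset 0 means "null". */
typedef uint32_t ref32;

typedef struct {
    unsigned char *base;  /* the single full-width pointer */
    uint32_t size;        /* arena capacity in bytes */
    uint32_t used;        /* bump-allocator cursor */
} arena;

typedef struct {
    int32_t value;
    ref32 next;           /* 4 bytes instead of an 8-byte pointer */
} node;

static void arena_init(arena *a, void *buf, uint32_t size) {
    assert(buf != NULL && size >= 8);
    a->base = buf;
    a->size = size;
    a->used = 8;          /* skip offset 0 so it can stand for null */
}

/* Returns the offset of a fresh 8-byte-aligned block, or 0 if full. */
static ref32 arena_alloc(arena *a, uint32_t bytes) {
    assert(bytes > 0);
    uint32_t aligned = (bytes + 7u) & ~7u;   /* round up to 8 bytes */
    if (aligned < bytes || aligned > a->size - a->used) return 0;
    ref32 r = a->used;
    a->used += aligned;
    return r;
}

/* The pointer arithmetic every access has to perform. */
static void *deref(const arena *a, ref32 r) {
    return r ? a->base + r : NULL;
}
```

A caller would hand `arena_init` an 8-byte-aligned buffer, then link nodes by storing the `ref32` values returned by `arena_alloc` and walking them with `deref`. With 32-bit offsets a single arena tops out at 4GB.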

--
"Freedom means freedom for everybody" -- Dick Cheney
Latency by andrewl6097 · 2004-01-21 13:56 · Score: 2, Interesting

Given that memory access times are bound by latency far more than bandwidth, the effect of loading another four bytes into the register file is most likely insignificantly small. I'm certain that the cost of 8-byte register-to-register operations *is* insignificantly small, and it's likely that pointers, given that they are not large but are often accessed, would be kept in registers. It would depend highly on the particular architecture.
Probably not as big a deal as you think. by Anonymous Coward · 2004-01-21 14:02 · Score: 5, Interesting

With modern processors it's not uncommon to require 64-bit or 128-bit memory alignment on data structures to get the best performance. There are even some instructions that *require* such data alignments in order for them to work at all (for example: MMX or SIMD).

Because of these existing data alignment issues, going from 32-bit to 64-bit pointers may have absolutely no impact on a program's memory usage and cache performance. It is highly likely you're already using 64-bit alignment when you enable the compiler's optimizations.

Unless you're building massive linked lists of stuff in a scientific / simulation environment this is probably something not worth worrying about. The efficiency and volume of your actual data will still be the biggest waste of space - and it's not like you won't be able to attach more physical memory onto your new system than the old one.

If it does affect you... you probably already know what you're doing or you've been making very bad assumptions about the size of your variable types.
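
To make the alignment point concrete, here is a small C sketch under stated assumptions: fixed-width integers stand in for 32-bit and 64-bit pointers so both layouts can be compared in a single build, and the struct names are invented for the example. On a typical x86-64 ABI, where `double` is 8-byte aligned, the record holding a double is the same size either way, while the linked-list cell doubles in size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* uint32_t / uint64_t stand in for a 32-bit and a 64-bit pointer. */
struct rec32  { uint32_t link; double payload; };  /* padded if double is 8-aligned */
struct rec64  { uint64_t link; double payload; };
struct cell32 { uint32_t next; int32_t value; };   /* classic list cell */
struct cell64 { uint64_t next; int32_t value; };

static_assert(sizeof(struct cell32) == 8, "two 4-byte fields, no padding");

/* Extra bytes per object when the link field widens from 4 to 8 bytes. */
static size_t rec_growth(void) {
    return sizeof(struct rec64) - sizeof(struct rec32);
}

static size_t cell_growth(void) {
    return sizeof(struct cell64) - sizeof(struct cell32);
}
```

Exact sizes depend on the compiler and ABI, so treat the results as something to measure on your own target, not as a guarantee.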
1. Re:Probably not as big a deal as you think. by MarkCollette · 2004-01-22 06:54 · Score: 3, Interesting
  
  Since no one who responded to you believes you, I thought I'd add in.
  
  Yes, x86 does not require alignment for the vast majority of data accesses, with pretty much the sole exceptions being SIMD instructions. And yes, that will run psychotically slower than aligning the data, which is why the compiler does it. Look into your MS VC++ optimization setting and see if it's using 4 byte or 8 byte alignment of structures by default. My goodness, it's 8 byte alignment, but why you ask? Because doubles need 8 byte alignment or else performance drops off a cliff. So don't discount alignment.
  
  As well, most code uses relative addressing with instructions, not absolute addressing, so don't expect all memory references to suddenly double, especially with stack caching of variables (which would be relative from the stack pointer).
  
  And finally, if 64 bitness leaves caches half full of zeros, expect chip manufacturers to include compression circuitry to alleviate that problem.
  
  - Mark Collette
2 comments by wayne606 · 2004-01-21 17:31 · Score: 2, Interesting

First of all there is no such thing as a typical program... If you are writing a lisp interpreter where everything is a pointer then you may see your memory usage almost double. If you have a numerical program that is dominated by huge arrays of floats you might not see any difference at all.

Second, here is a trick I have seen - it seems a bit strange but works well if you encapsulate your data well. Keep in mind that objects are generally aligned to an 8-byte boundary (if they are malloc'ed). That means your low 3 bits are not used at all. If your objects have, say, 64 bytes of data in them (possibly after a bit of padding) and are aligned to that size, then you are wasting 6 bits. Just store your pointers as 32-bit words, shifted over by 6 bits. When you want to dereference them, your get-the-pointer accessor function just shifts them back and gives you a 64-bit pointer.

Now you have an effective address space of 256GB and your data size has not grown at all. Maybe you have taken a hit in performance but until you benchmark you never know...
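
A hedged C sketch of the shift trick. One deliberate deviation from the description above: real 64-bit processes often place the heap far above 256GB, so this version shifts an offset from a pool base instead of the raw address. The names `pool`, `cptr`, `compress` and `expand` are assumptions for the example, and slot 0 is reserved to mean null.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OBJ_SHIFT 6u                 /* 64-byte objects: low 6 bits are zero */
#define OBJ_SIZE  (1u << OBJ_SHIFT)

typedef uint32_t cptr;               /* compressed pointer; 0 means null */

typedef struct {
    unsigned char *base;             /* start of the object pool */
} pool;

/* Store a pointer as a 32-bit word: offset from base, shifted right by 6. */
static cptr compress(const pool *pl, const void *p) {
    if (p == NULL) return 0;
    uintptr_t off = (uintptr_t)p - (uintptr_t)pl->base;
    assert(off != 0 && (off & (OBJ_SIZE - 1u)) == 0);  /* aligned, not slot 0 */
    assert((off >> OBJ_SHIFT) <= UINT32_MAX);
    return (cptr)(off >> OBJ_SHIFT);
}

/* The get-the-pointer accessor: shift back and add the base. */
static void *expand(const pool *pl, cptr c) {
    return c ? pl->base + ((uintptr_t)c << OBJ_SHIFT) : NULL;
}

/* Bytes reachable by a 32-bit word shifted by 6 bits: 2^38 = 256GB. */
static uint64_t reach_bytes(void) {
    return (uint64_t)1 << (32u + OBJ_SHIFT);
}
```

Hiding `compress` and `expand` behind accessors is what the encapsulation remark is about: the rest of the program never sees the 32-bit form.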
really !? by Anonymous Coward · 2004-01-21 19:36 · Score: 1, Interesting

Care to document the above statement? I believe it to be somewhere between very inaccurate and downright false, and here's why:

Using more bits for addressing does not do anything to the cache's data memory: that memory works in CPU words and on a 64 bit system those are already 64 bit. Cache lookup tables, on the other hand (where it stores the high-order address bits for each cached block) will have to accommodate the additional bits, but that is done on the drawing board when designing the cache and it is a fixed amount dependent on the number of blocks in the data part of the cache memory.

Also, paging has nothing to do with caching. Paging is a memory virtualization mechanism the CPU uses. Caching works in blocks of data that are almost never the size of a CPU page, they are "lines" of 16, 64 or whatever bytes.
Re:Embedded 64-Bit by Scherf · 2004-01-21 22:57 · Score: 2, Interesting

OTOH, other than disk controller caches (?), what kind of embedded systems need more than 4GB online simultaneously ?

Some CAD programs used in mechanical engineering (CATIA V5, for example) could use that much. Loading a whole car engine into one of these programs will exceed 4 GB pretty quickly.
Answer: yes by p3d0 · 2004-01-22 02:11 · Score: 2, Interesting

A while back, I was looking into more efficient heap storage of Java objects, and found that the heaps of a variety of Java programs consisted of about half pointers and half ints. The next most common type was booleans, and they were under 1%. Everything else was vanishingly small.
Thus, you can expect Java heaps to expand by about 50% when moving from 32-bit to 64-bit pointers. What effect this has on your program's performance depends on the relation between the program's resident sets and the machine's cache. For instance, if your program has a resident set of 200KB on a machine with a 256KB cache, then the extra 50% will blow the cache and kill your performance. If the resident set were 150KB, the performance impact would probably be minimal.
Disclaimer: I was doing this as a pet project in my spare time, so take these numbers with a grain of salt.
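
The arithmetic above, written out as a small C sketch. The helper names `grown_size` and `fits_in_cache` are made up, and the model is deliberately crude: it only assumes that the pointer share of the data doubles while everything else stays the same size.

```c
#include <assert.h>
#include <stdint.h>

/* Size after pointers double in width, given the percentage of the
   original bytes that were pointers (50 for "half pointers"). */
static uint64_t grown_size(uint64_t bytes, unsigned pointer_percent) {
    assert(pointer_percent <= 100);
    return bytes + bytes * pointer_percent / 100;
}

/* Crude check: does the working set still fit in the cache? */
static int fits_in_cache(uint64_t bytes, uint64_t cache_bytes) {
    return bytes <= cache_bytes;
}
```

With a 50% pointer share, a 200KB set grows to 300KB and no longer fits a 256KB cache, while a 150KB set grows to 225KB and still fits.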

--
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
64 bit pointers on embedded platforms? by polyp2000 · 2004-01-22 04:06 · Score: 2, Interesting

>This is especially an issue on embedded platforms where RAM is at a premium... What kinds of embedded platforms are likely to need more than 4GB of RAM anyhow? I sure as hell can't imagine a use for a 64-bit washing machine with upwards of 4 gigs... That's a hell of a lot of washing programmes.

--
Electronic Music Made Using Linux http://soundcloud.com/polyp
Re:Embedded 64-Bit by j3110 · 2004-01-22 05:49 · Score: 3, Interesting

I think the cache argument is complete BS. It appears it would be true, but it isn't really. Most pointers are created and controlled by the compiler, and they are going to be relative 90% of the time. That's why relative addressing was invented. So, you get an extra 4 bytes stuffed into your cache on the relatively rare occurrence that one of your 8 byte pointers is being used. With this 8 byte pointer, I'm just going to assume that you aren't idiot enough to be accessing memory in the same page most of the time. I really think that the page switching of the RAM to access the data at the other end of the pointer is going to be the greatest overhead. Normal cache misses in a 64bit addressing scheme should be exponentially more than in a 32bit one, if you really needed the 64bit.

So while you may have a caching problem, I think it's going to be because of accessing more data rather than the 4 bytes extra on some pointers.

Now if you're using disk based data structures, you better be using 64bit. I could make an exception if you used a 32bit number to address the cluster, then a 16bit number to access the actual data in the cluster, if required. A good DB server would do well to use 32bit cluster numbers to save index size, then scan the loaded cluster for the record. AFAIK, no one has been clever enough to do this, but I'm not privy to the internal structures of a lot of DBMSs. And this would matter a lot, because you could fit much more of the index into memory, and have much less data to read on the drive. Throwing away CPU cycles and memory for more compact disk data is a common practice.
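
A rough C sketch of the 32-bit cluster plus 16-bit offset addressing, just to show the sizes involved. The type and function names are invented here, the little-endian 6-byte layout is an assumption, and nothing below reflects how any particular DBMS stores its indexes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical on-disk record address: 6 bytes instead of 8. */
typedef struct {
    uint32_t cluster;  /* which cluster on disk */
    uint16_t offset;   /* byte offset of the record inside the cluster */
} rec_addr;

/* Absolute byte position on disk for a given cluster size. */
static uint64_t byte_position(rec_addr a, uint32_t cluster_size) {
    assert(a.offset < cluster_size);
    return (uint64_t)a.cluster * cluster_size + a.offset;
}

/* Pack into 6 bytes, little-endian, as an index entry would store it. */
static void pack_addr(rec_addr a, unsigned char out[6]) {
    for (int i = 0; i < 4; i++) out[i] = (unsigned char)(a.cluster >> (8 * i));
    out[4] = (unsigned char)(a.offset & 0xFF);
    out[5] = (unsigned char)(a.offset >> 8);
}

static rec_addr unpack_addr(const unsigned char in[6]) {
    rec_addr a;
    a.cluster = 0;
    for (int i = 0; i < 4; i++) a.cluster |= (uint32_t)in[i] << (8 * i);
    a.offset = (uint16_t)(in[4] | (in[5] << 8));
    return a;
}
```

With 64KB clusters the 48-bit form covers 2^48 bytes. Dropping the 16-bit offset and scanning the loaded cluster for the record shrinks each index entry to 4 bytes, at the cost of CPU time per lookup.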

--
Karma Clown