What Improvements Will 64-Bit Processors Bring?

Posted by Cliff on Tuesday December 4, 2001 @10:47AM from the kibbles-n-bits-and-more-bits dept.

RyanG asks: "Everyone always looks at numbers (MHz, RAM, HD) when they're considering buying a new computer. Recently, more users have been eyeing bits, as in 64-bit processors, namely the Itanium and to a lesser extent the G5. A lot of people remember the performance increases that were seen when moving from 16 to 32-bit processors and some people seem to think similar performance increases will be realized when moving from 32 to 64-bit processors. From what I've read this isn't going to be the case, given that 64-bit precision is needed in only a few cases and that moving around that extra data can actually hurt the performance of 64-bit processors when compared to 32-bit processors. Anyone care to comment?"

Why 64 bits isn't the big deal 32 bits was by polymath69 · 2001-12-04 11:09 · Score: 4, Informative

Back when we had 16 bit computers, this created limits that we had to actively work around, all the time. 64K was a hard limit all over the place.
On older machines, this was either an absolute hard limit (64K, period) or kludged in some way (Apple //c had a special bit to bankswitch in one or the other 64K memories, but both couldn't be used at the same time.)
The IBM PC had its segment registers and so could address 1MB, but it was far from transparent. There was no way to declare, say, a 200K array of strings. The programmer's data structures had to be tied quite closely to the peculiarities of the architecture. We spent as much time working around the limitations as we did writing useful code.
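
To make the segment arithmetic concrete, here is a minimal C sketch of how an 8086-style segment:offset pair turns into a physical address, and how many 64K segments a flat object would have to be split across. The function names are made up for illustration; the shift-by-4 and the 20-bit wrap are the usual real-mode rules.

```c
#include <stdint.h>

/* Real-mode 8086 address: segment * 16 + offset, wrapped to 20 bits (1MB). */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    uint32_t linear = ((uint32_t)segment << 4) + offset;
    return linear & 0xFFFFFu;   /* the 8086 has only 20 address lines */
}

/* Number of 64K segments needed to hold a flat object of `bytes` bytes. */
uint32_t segments_needed(uint32_t bytes)
{
    return bytes / 0x10000u + (bytes % 0x10000u != 0);
}
```

A 200K array (204,800 bytes) comes out at 4 segments, so the program has to manage the segment registers itself whenever an index crosses a 64K boundary.
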
When 32 bit computing came along, bam! What a change! Want to declare a 20MB data structure? Go for it! In terms of artificial restrictions, there just weren't any practical limits to run into, or around, day in and day out.
The reason 64 bits won't be as revolutionary as 32 bits is that, for the most part, 32 bits is still good enough. Even with the bloated software we have these days, 4 billion is still plenty when it comes down to most things. Take time_t; even as a signed 32-bit count of seconds, that's not going to overflow until 2038, more than 30 years from now. 4 billion is a lot.
A 64-bit CPU working with 32-bit data is being slightly inefficient, but don't worry too much about a slowdown from that, as these CPUs will tend to be inherently faster over time, which should more than make up for it.
So, basically, you heard right. I think.
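
For anyone who wants to check the time_t point, here is a rough C sketch. It assumes time_t counts seconds since 1970 and uses an average Gregorian year of 365.2425 days, so it only gives the calendar year, not the exact date; the function name is invented for this example.

```c
#include <stdint.h>

/* Average Gregorian year in seconds: 365.2425 days * 86400 seconds. */
#define SECONDS_PER_YEAR 31556952LL

/* Approximate calendar year in which a seconds-since-1970 counter
   with the given maximum value runs out. */
int overflow_year(int64_t max_seconds)
{
    return 1970 + (int)(max_seconds / SECONDS_PER_YEAR);
}
```

Calling overflow_year(INT32_MAX) gives 2038 for a signed 32-bit counter, and overflow_year(UINT32_MAX) gives 2106 for an unsigned one.
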

--
I don't want to rule the world... I just want to be in charge of mayonnaise.
It's about size, not speed by "Zow" · 2001-12-04 11:35 · Score: 5, Informative

This makes question #11 on my Architecture midterm today. . .

The jump from 32 to 64 bits isn't about speed or precision, it's about the amount of usable address space on a given architecture. For whatever reason (call it functionality, call it bloat, whatever), the amount of address space that programs require is going up by .5 to 1 bit per year. Have you noticed that a lot of people are starting to complain that their PCs are maxed out at 4GB, especially for heavyweight apps like db servers, simulation programs or MSWord? Or that there's been a lot of work on Linux or NT to allow the user to access more of the 4GB on the box? Guess what? The 80386 came out 16 years ago.

So the jump now is mostly to allow us to continue to grow for another 32 years. Most processor manufacturers tried to get the migration started early - the SPARC, MIPS and Power(PC) chips have all supported 64-bit operation for some time now. The Alpha was originally designed as a 64-bit processor 10 years ago. Intel and AMD are actually rather late to the game.

It's been said that the only thing that killed the PDP-11 from DEC was its small (16-bit) address space - users were very happy with it, but when they needed more room for their programs, the PDP just couldn't be expanded to handle them. This is probably why DEC started migrating everyone to the Alpha 10 years ago. The original release of the Alpha only used a 34-bit address path (so it could access 16GB of RAM - the rest is reserved). If you want the details check out chapter 5 of Computer Architecture: A Quantitative Approach by Hennessy & Patterson.
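
The numbers above are easy to sanity-check. This is a small C sketch, not anything from the textbook; the function names are invented, and the growth model is just the steady ".5 to 1 bit per year" rate quoted above.

```c
#include <stdint.h>

/* Bytes addressable with `bits` address bits (only valid for bits < 64). */
uint64_t addressable_bytes(unsigned bits)
{
    return (uint64_t)1 << bits;
}

/* Years it takes to grow from `from_bits` to `to_bits` of address space
   at a steady rate of `bits_per_year`. */
double years_of_headroom(unsigned from_bits, unsigned to_bits, double bits_per_year)
{
    return (double)(to_bits - from_bits) / bits_per_year;
}
```

With 34 bits you get 16GB, with 32 bits you get 4GB, and going from 32 to 64 bits buys 32 years at 1 bit per year or 64 years at half a bit per year.
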

-"Zow"
1. Re:It's about size, not speed by SecretAsianMan · 2001-12-04 19:30 · Score: 4, Informative
  
  The transition of Digital machines was not so clear-cut. Yes, the PDP-11, which was made in various models from 1970 to 1990, was a very popular machine and was generally considered to be 16-bit. The first models had a true 16-bit address space, but later models improved upon that. The 11/45 introduced 18-bit addressing and separate address spaces for data and code. The 11/70 introduced 22-bit addressing.
  
  Also, Digital was already making 36-bit machines, starting with the PDP-6 in 1964. The 36-bit PDP-10, which came in several flavors, was quite popular and spawned quite a culture of its own.
  
  Lastly, how can you not mention the 32-bit VAX? From 1976 to 1999 (!), this was Digital's 32-bit machine, and it was also very, very successful. By the time the Alpha hit the scene, the VAX had certainly taken over the supermicro/mini/supermini position formerly held by the PDP-11.
  
  --
  Washington, DC: It's like Hollywood for ugly people.
Average Joe Doesn't Push It by Nater · 2001-12-04 11:46 · Score: 4, Insightful

Think about what the average home/office user is doing on the computer and how much processing power it really takes to make that cursor blink. The simple fact is that for a typical office suite and web browser, current technology is overkill. Some people like to play audio, video, or games on their computers and that takes some more processing power, but it's nothing that pushes the limits of modern hardware (you gamers who say you can tell the difference between 100 and 125 FPS are lying... that's 1.5 to 2 times your monitor's refresh rate).

People are going to get the hot new toys because they're hot new toys and then be really disappointed when everything they've been doing doesn't get any better.

Somebody somewhere might develop the killer app that makes a 64-bit processor make sense for home and desktop users, and I can think of a few things that have the potential to take off like that, but until then the new hardware will basically be a "my dick is bigger than yours" type of thing. I honestly hope that killer app comes sooner rather than later because whatever it is, it'll be killer.

--
I like to play children's songs in minor keys.
"We're all sons of bitches now." --J. Robert Oppenheimer
1. Re:Average Joe Doesn't Push It by green+pizza · 2001-12-04 12:27 · Score: 4, Interesting
  
  Very well said. However, I did want to make one point.
  
  (you gamers who say you can tell the difference between 100 and 125 FPS are lying... that's 1.5 to 2 times your monitor's refresh rate)
  
  I typically can't stand gamers, but I do understand their desire for framerate. The typical PC game has no mechanism for holding a sustained framerate, nor are things like texture preloading handled with any sort of elegance. Most PC games use weak code and brute force to produce any acceptable output. As such, the machine with the highest average framerate is the machine that's the least likely to get into a situation where the framerate will drop below, say, 30 FPS... perhaps in a complex scene or hitting a spot of particularly bad code.
  
  That said, the nitpicking "this 3D accelerator is better because it's 10.2% faster" blurbs are mostly BS.
  
  Then there's the other end of the spectrum. The company I work for has an SGI-powered RealityCenter for engineering review and presentations. The 30-foot-wide screen is curved and lit by three Barco projectors. It's normally driven at 3840x1024 super-wide using all three projectors, each driven by its own graphics pipe. For more complex scenes, the three pipes work in parallel to drive just one projector at 1280x1024. Most of our software is created in house with the help of SGI IRISPerformer and MultiGen-Paradigm Vega libraries. Aside from a few exceptions, the whole setup runs at a locked 60 Hz (60 FPS gfx and projector).
  
  For those that like tech specs, the machine behind the curtain is a Silicon Graphics Onyx2 installed in early 1999. It has 24 MIPS R10000 CPUs each with 8 MB of L2 cache and running at 250 MHz. 48 GB RAM and 1.8 TB of disk via four channels of gigabit fibrechannel. The graphics pipes are three InfiniteReality2 subsystems, each with four Raster Managers (64 MB of dedicated texture ram plus 320 MB of generic graphics ram per pipe). There's a DPLEX module on each pipe to allow all three to work in parallel when needed.
  
  If the bean counters approve, we should have a totally new Onyx3000 system installed by June 2002. After all, our current setup is about 3 years old... ancient by computer terms. Thankfully the projectors, lighting controls, and indeed most of the room (seating, conference table loft, etc) will be reused.
Other new features of 64-bit processors by jquirke · 2001-12-04 12:59 · Score: 5, Informative

Mainly it seems people are talking about the register width, precision, and of course address space.

Keep in mind the first Itaniums have a 64-bit virtual address space, while the physical space is limited to 52 bits, I think.

The Hammer series processors are really just an x86 extension. They offer nowhere near the capabilities of Intel's fresh start with IA64.

Here are some of the features of the IA64:

-> Heavy use of ILP (Instruction Level Parallelism) - speaks for itself.

-> Predication - fewer branches taken and hence less stalling. The conditional handling is done through a controlling predicate, rather than by jumping. Look at this C code:

if (!eax) ebx=VALUEB; else ebx=VALUEA;

Now the i386 code:

testl %eax,%eax
jz 1f
movl $VALUEA,%ebx
jmp 2f
1: movl $VALUEB,%ebx
2:

Now the IA64 code:

cmp.eq p2,p3 = 0,r5 ;;
(p2) mov r4 = VALUEB
(p3) mov r4 = VALUEA
/* p2 is set when r5 is zero, p3 when it is not; the two predicated moves can issue in parallel */

Now whereas the i386 code jumps all over the place, stalling the CPU whenever a branch is mispredicted, the IA64 code uses the predicate registers (p2, p3) to decide which of the two instructions takes effect, with no branch at all.
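
If you don't have an Itanium handy, the idea can be imitated in plain C. This is only a sketch of the concept, not what an IA64 compiler emits: the "predicates" here are bit masks, and the function and parameter names are made up for this example.

```c
#include <stdint.h>

/* Branchy version, same logic as the C line above. */
uint32_t select_branchy(uint32_t eax, uint32_t value_a, uint32_t value_b)
{
    if (!eax)
        return value_b;
    return value_a;
}

/* Branch-free version: build two complementary "predicates" as masks,
   evaluate both sides, and keep only the one whose predicate is set. */
uint32_t select_predicated(uint32_t eax, uint32_t value_a, uint32_t value_b)
{
    uint32_t p_zero = (uint32_t)0 - (uint32_t)(eax == 0); /* all ones if eax == 0 */
    uint32_t p_nonzero = ~p_zero;                         /* all ones if eax != 0 */
    return (value_b & p_zero) | (value_a & p_nonzero);
}
```

Both functions return the same result for every input; the second one just never needs a conditional jump to get there.
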

->Huge register sets

r0-r127 are the general 64-bit registers; compare this to eax, ebx, ecx, edx, esi, edi and ebp.

p0-p63 are the predicates

As well as 128 82-bit floating-point registers, f0-f127

->Speculation

Normally you can't reschedule a load to run before a store because the addresses can overlap

*ptr=b;
some_code_that_does_not_touch_b_c_ptr_ptr2();
c=*ptr2;

Previously, you couldn't move c=*ptr2 prior to the start of this code because ptr could overlap the same memory as ptr2.

Now you can - basically the load is performed early anyway (using the "advanced load" instruction), which allocates an entry in an internal table. Then the store runs, and _if_ that store overwrites the loaded address, the load is performed again. Hopefully, though, this shouldn't happen often. And it's more flexible and powerful than this; this was just a simple example.
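
Here is roughly what that looks like if you spell it out by hand in C. Treat it as a sketch under simplifying assumptions: the real hardware tracks overlapping addresses in its internal table, while this version only checks whether the two pointers are equal, and the function names are invented for illustration.

```c
/* Original order: the store happens first, then the load. */
int store_then_load(int *ptr, int b, const int *ptr2)
{
    *ptr = b;
    return *ptr2;
}

/* Load hoisted above the store, with a check afterwards. */
int load_hoisted_with_check(int *ptr, int b, const int *ptr2)
{
    int c = *ptr2;      /* early load, standing in for the "advanced load" */
    *ptr = b;           /* the store that might alias the loaded address */
    if (ptr == ptr2)    /* check: did the store hit the address we loaded? */
        c = *ptr2;      /* yes: perform the load again */
    return c;
}
```

Both versions return the same value whether or not the pointers alias; the hoisted one simply gets its load started earlier in the common case where they don't.
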

-> Remappable registers - the registers can be remapped somewhat the way memory can be paged, so when calling a new function the stack is not necessarily needed to push and pop various registers.

->"Modulo" loop scheduling

The next iteration of a loop can begin before the previous one has finished - the remappable registers "rotate" to give each iteration a new set of virtual registers
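
A hand-written C analogy may help here. This is a sketch of software pipelining in general, with made-up function names; on IA64 the register rotation is done by the hardware, whereas here the "rotation" is just copying `next` into `loaded`.

```c
#include <stddef.h>

/* Plain loop: each iteration loads, then computes and stores. */
void scale_plain(const int *in, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * 2 + 1;
}

/* Pipelined loop: the load for iteration i+1 is issued while
   iteration i is still computing and storing its result. */
void scale_pipelined(const int *in, int *out, size_t n)
{
    if (n == 0)
        return;
    int loaded = in[0];                 /* prologue: first load */
    for (size_t i = 0; i + 1 < n; i++) {
        int next = in[i + 1];           /* stage 1 of iteration i+1 */
        out[i] = loaded * 2 + 1;        /* stage 2 of iteration i */
        loaded = next;                  /* "rotate" to the next value */
    }
    out[n - 1] = loaded * 2 + 1;        /* epilogue: finish the last one */
}
```

The two functions produce identical output; the second just overlaps work from neighbouring iterations inside a single pass of the loop body.
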

->An interesting way of handling paging

Which reduces TLB flushes on task switches by tagging an entry in the page tables with a unique ID specific to a process - I'm not fully sure on the details on this since I've never looked at IA64 system programming.

Sorry I'm sounding like an Intel brochure, but it really is quite amazing if you're coming from an x86 programming background - IA64 is a lot more than doubling data unit sizes. If you're familiar with assembly programming, I suggest reading the IA64 manual at developer.intel.com.
Some advantages nobody's touched on yet.. by cmowire · 2001-12-05 06:20 · Score: 4, Interesting

Some advantages nobody's touched on yet..

1) Easier implementation of large filesystems. 2^64*512 bytes of disk space per filesystem should be good enough for quite a few increments of Moore's Law. Ditto for larger than 4 gig files.
2) 64 bit processors take better to certain implementations of NUMA. SGI's implementation of NUMA gives each processor a range of memory that is local to that processor. If you had a 64 processor NUMA cluster, you'd have 64 megs local to each processor with 32 bit processors. You could have a few gigs per processor with 64-bit addressing.
3) With 64-bit processors, it's easier to map a whole file into memory at once, without needing to map individual chunks. Over the near term, you could map your entire disk drive to memory space.
4) There are cases (e.g. bit packing) that don't take too well to vectorized MMX/SSE/etc. processing but do take well to 64 bit registers.
5) The ability to segment your memory space without creating annoying limitations. As in, you can have the lower 8,388,608 terabytes of address space reserved for the user and the upper 8,388,608 terabytes reserved for the kernel, as opposed to Windows 2k, which leaves 2 gigs for the user and 2 gigs for the kernel (with the possibility of 3 gigs for the user if you are running a higher-end version).
6) The ability to cache a data structure in the RAM attached to a given machine instead of buying solid state disk drives or other such things.
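
The figures in points 1, 2 and 5 can be checked with a few lines of C. This is just arithmetic, with invented function names; the 40-bit case in the note below is my own assumption to put a number on "a few gigs per processor", not a claim about any particular machine.

```c
#include <stdint.h>

/* Point 1: address bits needed to reach every byte of a filesystem with
   2^block_number_bits blocks of 2^log2_block_size bytes each. */
unsigned filesystem_address_bits(unsigned block_number_bits, unsigned log2_block_size)
{
    return block_number_bits + log2_block_size;
}

/* Point 2: an address space of 2^address_bits bytes split evenly across
   `cpus` processors (only valid for address_bits < 64). */
uint64_t bytes_per_cpu(unsigned address_bits, uint64_t cpus)
{
    return ((uint64_t)1 << address_bits) / cpus;
}

/* Point 5: half of a 64-bit address space, in terabytes of 2^40 bytes. */
uint64_t half_of_64bit_space_in_tb(void)
{
    return ((uint64_t)1 << 63) >> 40;
}
```

A 64-bit block number with 512-byte blocks needs 73 bits, 4GB across 64 processors is 64 megs each, a hypothetical 40-bit space across the same 64 processors would be 16 gigs each, and half of a 64-bit space is the 8,388,608 terabytes quoted in point 5.
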

--
Gentoo Sucks