Slashdot Mirror


Performance of 64-bit vs. 32-bit Windows Dual Core

mikemuch writes "ExtremeTech's Loyd Case has done extensive testing on the same dual-core Athlon X2 4800+ system to explore performance differences between Windows XP Professional x64 and good ole Win32. The biggest hurdle is getting the right drivers. There are a few performance surprises, particularly in 3D games."

15 of 319 comments

  1. Plenty of time to wait for 64 bit apps. by Godeke · · Score: 4, Interesting
    The good news this article tells us is that the 64 bit OS doesn't cause any significant loss of performance for the 32 bit applications that will function under it. On the other hand, the only 64 bit to 32 bit comparisons they have also show almost no differences. I think this is the most telling:


    The good news is that 32-bit Far Cry (as of the 1.31 patch) runs fine under Windows 64-bit mode, with very little performance penalty. When we move to the base 64-bit version, we pick up a couple of frames per second at 1280x1024, but we defy anyone to actually notice the difference between 79.5 and 82 fps.

    The good news is that the enhanced version still clocks in at 80 fps. This bodes well for 64-bit gaming, as game developers can add substantial new content and detail without sacrificing performance.


    Desktop applications (even games) don't need the one thing that 64 bit computing really excels at: massive addressing space. A database server that is compiled to 64 bit code will have access to much more RAM, and thus have much better performance if RAM bound (which many DBs are). Meanwhile for POV-Ray the fastest result of 383 seconds was the 32bit application on 64 OS!

    I think that it is safe to hold off on 64 bit for your personal desktop until a larger share of applications are compiled with 64 bit optimizations, but unlike the 16 -> 32 bit shift, I suspect the results will be underwhelming except for extremely memory consuming applications.
    --
    Sig under construction since 1998.
    1. Re:Plenty of time to wait for 64 bit apps. by Bloater · · Score: 3, Interesting

      > Is it possible that diminishing returns is kicking in on the register set size, or simply bad compilers (or use thereof)?

      Bad compilers or more likely they haven't hand optimised their inner loops.

      Most high performance ia32 (Intel Architecture 32 bit) software has hand tuned assembler for the tight inner loops, but it takes time, experience and skill to create such assembler. Some discussions I've seen put recent gcc compiling generic C for amd64 at close to the performance of hand optimised assembler for ia32 on the same Athlon 64 (for tight inner loops).

      There was an article about an assembler version of a cryptographic function that showed amd64 was capable of a *huge* performance increase over ia32, due to its increased register set.

      However it can also come down to implementation quality. IIRC, benchmarks of early amd64 xeon chips showed that they performed worse than ia32 on the same chip for tests that athlon 64 shows a performance *boost* in its 64 bit mode.

    2. Re:Plenty of time to wait for 64 bit apps. by freidog · · Score: 3, Interesting

      Well, x86 chips have pretty well developed methods of dealing with the lack of registers.
      Register renaming eliminates, or at least minimizes, most of the problems with a small register set.
      (Athlon64 has something like 72 integer registers and 122 90 bit FP registers (two of these are combined to make an XMM register for SSE vectors), almost all of which are available in 32 bit mode).

      The extra architectural registers will help with moderate to long term storage (more than a few dozen clock cycles between uses), as the programmer will explicitly specify that the data remains in the register, whereas with current shuffling it's up to the CPU (and to some extent how the renamed registers are intended to work) to determine whether a write to cache is in order.
      And really, with the longer storage times, you often have the flexibility to write out to L1 and schedule the load so that there's no penalty for it (i.e. issue the move back to the register the 3 clock cycles that an L1 load usually takes before you need it).

      The new registers probably won't make all that much difference in the end. But then again, nothing from the move to 64 bit will have a major impact for a while (at least on the desktop).

  2. After reading the benchmarks... by vandy1 · · Score: 5, Interesting

    I can only conclude that they made no attempt to use the extra registers. Of *course* an f'ing 32-bit system will outpace a 64-bit system; Why do you think most Solaris apps are still 32-bit?

    The reason why x86-64 is a win is because there are more registers as well. This allows compilers to do a better job.

  3. Better solution than Linux? by 00_NOP · · Score: 2, Interesting

    As I understand it, most users of a 64 bit Linux kernel are using a 32 bit (GNU? I want to avoid a religious war :)) userland, whereas this suggests Windows users can mix and match.
    Is there a Linux equivalent available?
    Having said all that, I well remember getting MS to agree with me that there was a bug in their Win32 bolt on for Win16 that meant my software wouldn't run, but they then said they wouldn't fix it! No wonder I eventually switched to Linux... but that's a whole other story.

    1. Re:Better solution than Linux? by WhiteWolf666 · · Score: 3, Interesting

      Windows has the same 32-bit cruft.

      With 32-bit apps, you need a 32-bit userland. That's the WoW64 bit; it's the 32-bit Windows on Windows cruft.

      The main difference is that the linux stuff is organized differently. lib is your 32-bit libraries, while lib64 is your 64-bit stuff.

      On Windows, the 'normal' location is where you would find the 64-bit libraries, and the WoW64 stuff is loaded from a separate directory.

      Implementation details: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/win64/win64/wow64_implementation_details.asp

      Select Quote:
      The WOW64 emulator runs in user mode, provides an interface between the 32-bit version of Ntdll.dll and the kernel of the processor, and it intercepts kernel calls. The emulator consists of the following DLLs:
      Wow64.dll provides the core emulation infrastructure and the thunks for the Ntoskrnl.exe entry-point functions.
      Wow64Win.dll provides thunks for the Win32k.sys entry-point functions.
      Wow64Cpu.dll provides x86 instruction emulation on Itanium processors. It executes mode-switch instructions on the processor. This DLL is not necessary for x64 processors because they execute x86-32 instructions at full clock speed.
      Along with the 64-bit version of Ntdll.dll, these are the only 64-bit binaries that can be loaded into a 32-bit process.
      At startup, Wow64.dll loads the x86 version of Ntdll.dll and runs its initialization code, which loads all necessary 32-bit DLLs. Almost all 32-bit DLLs are unmodified copies of 32-bit Windows binaries. However, some of these DLLs are written to behave differently on WOW64 than they do on 32-bit Windows, usually because they share memory with 64-bit system components. All user mode address space above the 32-bit limits (2 GB for most applications, 4 GB for applications marked with the IMAGE_FILE_LARGE_ADDRESS_AWARE flag in the image header) is reserved by the system.


      It's a different methodology, but most likely one that works as well. I appreciate the Linux one better-- the "normal" 32-bit stuff lives in the "normal" places-- that way, you don't *need* an emulation layer for the 64-bit unaware apps. Rather, 64-bit aware apps know to look in the correct location for the libraries (well, they are told by the OS, anyways). The Linux Way (TM) is slightly more backward compatible, methinks. You'll *never* experience a problem with a 32-bit app on a 64-bit linux system, while there are some bugs in WoW64 which will probably never be fixed; rather, they'll be 'phased out', in the usual MS fashion (ignored until irrelevant).

      Information on the Linux approach is here: http://www.hp.com/workstations/pws/linux/faq.html
      Mainly, when recompiling your apps to be native 64-bit, you need to observe the following:
      Simple. Just rebuild from scratch and the compiler will build 64-bit by default. This is true for most apps. However, some apps must be made 64-bit clean which means that the developers must review the code to get rid of any assumptions about 32-bitness, such as pointer arithmetic issues. Some makefiles that explicitly declare paths such as /lib, /usr/lib and /usr/X11R6/lib might need to be changed to append "64".
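
      To make the "64-bit clean" point concrete, here is a minimal sketch in C of the kind of 32-bitness assumption the FAQ is talking about. The function names are made up for illustration and are not from the FAQ; the only assumption is a C99 compiler that provides uintptr_t in <stdint.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* NOT 64-bit clean: squeezes a pointer into 32 bits.
 * Harmless on a 32-bit build, but on a 64-bit build the
 * upper half of the address is silently thrown away. */
uint32_t ptr_to_u32_truncating(const void *p)
{
    return (uint32_t)(uintptr_t)p;
}

/* 64-bit clean: uintptr_t is wide enough to hold a pointer
 * on whatever platform the code is compiled for. */
uintptr_t ptr_to_int(const void *p)
{
    return (uintptr_t)p;
}

/* Converts a value produced by ptr_to_int back into a pointer. */
const void *int_to_ptr(uintptr_t v)
{
    return (const void *)v;
}

/* Pointer differences belong in ptrdiff_t, not int:
 * both pointers must point into the same array. */
ptrdiff_t elem_distance(const int *first, const int *last)
{
    assert(first != NULL && last != NULL);
    return last - first;
}
```

      Round-tripping a pointer through ptr_to_int and int_to_ptr gives back the same pointer on both 32-bit and 64-bit builds, whereas ptr_to_u32_truncating only happens to work while addresses fit in 32 bits.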

      --
      WhiteWolf666 an exBush supporter. All you new-school,compassionate,save the children Republicans can rot in hell
  4. Standard phallacy by vlad_petric · · Score: 5, Interesting
    The main performance gain from going to x86-64 does not come from larger operands and larger addressing space. It comes from a cleaned-up instruction set architecture and, most importantly, from a larger set of registers. x86-64 has 16 general-purpose registers whereas x86-32 arguably has about 7 GPRs. For x86-32, a compiler generally allocates 2 or at most 3 registers to variables. For x86-64, it can utilize ~12. This greatly reduces the number of loads and stores to the stack. The performance gain comes from the fact that it's much faster to communicate via a register than through memory.

    BTW, I don't know about windoze, but in the Linux world going from 32 bits to 64 bits almost always seems to produce a performance gain of 10-20%. I personally tried a simulator I'm using with 64 bits (recompiled with gcc), and got a speedup of 12%.

    --

    The Raven

  5. Sad to say by jmoo · · Score: 2, Interesting

    I've been messing around with Windows long enough to remember the 16bit to 32bit application jump made many years ago (when Windows NT 3.1 came out). A lot of the same stuff was said: lack of 32bit apps, huge memory requirements (32 MB of memory!), poor driver support (not that 16bit Windows was a lot better). Windows on Windows is nothing new; you still use WOW32 when you access a 16bit app in XP.

    --
    The world isn't run by weapons anymore, or energy, or money. It's run by little ones and zeroes, little bits of data.
  6. Re:performance difference by JonLatane · · Score: 3, Interesting

    "Dual core" and "dual processor" are two very different things.

  7. I'm taking a big risk by asking this.. by markass530 · · Score: 3, Interesting

    but I figure this is the only place I can get a good answer. I was just getting into computers during the 16-32 bit shift, Windows 95 etc. (I was 14). How come a new processor wasn't required like now? What's the difference? No need for complete layman's terms, as I consider myself a pretty avid computer geek, but certainly no engineer.

    1. Re:I'm taking a big risk by asking this.. by redfieldp · · Score: 3, Interesting

      Intel x86 processors were already 32-bit as of the 386 processor. Therefore, when Windows went 32-bit, all the processors out there were already 32 bit. By contrast, until now all Intel and AMD processors have been 32 bit, with the only 64-bit processors being made by other smaller vendors. Therefore, the processor/OS upgrades are simply closer together this time, and it is more apparent. However, as another poster noted: the same driver/incompatibility issues were present when Windows went 32-bit, it was just that no one had to upgrade their hardware.

  8. Coding practices need rethinking... by mi · · Score: 4, Interesting
    Complex data-structures involve a lot of pointers -- all of which are twice as big on 64-bit machines. Sometimes, this makes the pointers bigger than (or comparable to) the structures themselves.

    Most obvious are char * fields. If the string is 8 characters or fewer, it is cheaper to just store it in the structure (and pass by value, where possible).

    Considering that most such strings (and substructures) are malloc-ed (with a couple of pointers' worth of malloc's overhead), the case for embedding them becomes even stronger...
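
    A rough sketch in C of the trade-off described above. The struct names, the 8-byte field size and the helper functions are illustrative assumptions, not anything from a real codebase; the footprint figure for the pointer layout deliberately ignores malloc's own bookkeeping, which would only widen the gap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_EMBED_LEN 8   /* assumed size: same as a 64-bit pointer */

/* Layout 1: the record points at a separately malloc-ed string. */
struct rec_ptr {
    int   id;
    char *name;            /* 4 bytes on 32-bit, 8 bytes on 64-bit */
};

/* Layout 2: the short string lives inside the record itself. */
struct rec_embed {
    int  id;
    char name[NAME_EMBED_LEN];   /* up to 7 characters plus the NUL */
};

/* Copies a short name into the embedded field.
 * Returns 0 on success, -1 if the name does not fit. */
int rec_embed_set_name(struct rec_embed *r, const char *name)
{
    size_t len;

    assert(r != NULL && name != NULL);
    len = strlen(name);
    if (len >= NAME_EMBED_LEN)
        return -1;
    memcpy(r->name, name, len + 1);   /* +1 copies the terminating NUL */
    return 0;
}

/* Bytes used by the pointer layout for one record: the struct plus
 * the string it points at (malloc overhead not counted). */
size_t rec_ptr_footprint(const char *name)
{
    return sizeof(struct rec_ptr) + strlen(name) + 1;
}

/* Bytes used by the embedded layout: just the struct. */
size_t rec_embed_footprint(void)
{
    return sizeof(struct rec_embed);
}
```

    On a typical 64-bit Linux build, struct rec_ptr would usually come out at 16 bytes (the pointer forces 8-byte alignment) before the string itself is even allocated, while struct rec_embed would usually be 12 bytes in total. Keeping the NUL terminator limits the embedded name to 7 characters; storing all 8 would need a length-aware comparison instead of strcmp.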

    --
    In Soviet Washington the swamp drains you.
  9. Re:Better solution than Linux? NOT! by troberto · · Score: 2, Interesting

    I have been using a 64-bit userland under Gentoo Linux for over 1 1/2 years. I still have to run some 32 bit apps, like openoffice. It all just works. Of course I have a lot more respect for Gentoo's support ;->

  10. Re:performance difference by Anonymous Coward · · Score: 0, Interesting

    Dual cores share the same cache. Dual processors each have their own caches. You should expect slightly better performance from dual processors, but you would probably have to measure carefully to detect it; I doubt you would notice it.

    By the way, the best thing about dual cores or processors is that if an app goes insane in a tight loop, you can still use the computer (and find and kill the insane app). So there are hang conditions that would lock up a uni-processor machine, that you can easily recover from with dual. This works equally well for dual cores or dual procs.

    Basically, dual cores give almost all of the benefit of dual procs, but cost less.

  11. For people scratching their heads... by Hosiah · · Score: 2, Interesting
    Wondering what the heck all the extra processing power is good for? Because you may play games, compile apps, or even brute-force-crack your favorite target server, but there's one place where you're *guaranteed* to want faster hardware: generating 3D ray-traced graphics!

    Yes, I played the Sims, compiled gcc, ran Python chatterbots, had KDE in maximum eye-candy-mode and ran multiple processes in desktops 1-10, but the day I began trying to render a scene with transparent height-fields and looped ISOsurfaces and detailed meshes with fog media and twin area lamps is the day I discovered a new definition of "hardware performance"!