Are 64-bit Binaries Slower than 32-bit Binaries?

Re: OSNews by duffbeer703 · 2004-01-23 16:06 · Score: 5, Funny

In case anyone hasn't realized it yet, this article proves that OSNews is the most retarded website on the planet.

The typical story is titled like "A comprehensive review of the Atari ST". The contents are typically something like... "I found an old Atari ST, but my cdrom wouldn't fit in the 5.25" disk drive and mozilla wouldn't compile. So the Atari sucks"

I benchmarked a skilled Chinese abacus user against a C-programmer implementing an accounting system. The chinese dude figured out that 1+1=2 before the C-programmer loaded his editor, so the abacus is faster.

--
Conformity is the jailer of freedom and enemy of growth. -JFK

I read this yesterday. Here's a tip... by (1337)+God · 2004-01-23 16:06 · Score: 0, Interesting

I read this piece yesterday. Here's a tip for those of you who currently work, or may need to work, on building an x86 to x86_64 cross-compiler under the Linux operating system.

One of my tight friends, Dan Kegel (cute pic of him here, oh and he works for Google, so he's super-smart and rich! :-*), has something called the CrossTool at http://kegel.com/crosstool that should be of major help to anyone working with 64-bit Linux systems.

You may even be able to list it as COTS on your project even though it's free as in beer. In any case, I've tried it, it's sweet, you should try it, it works great for what it does, just like most *nix apps. I prefer having one small tool do something really well than one large software package do a bunch of things really crappily.

Anyway, stop by Dan's page and say hi. Tell him I sent ya ;-)

--

Background: 28/M/Bi-Sexual; Owner of a Linux company; MBA Harvard 2003; B.S. Comp Sci MIT 2000

It all depends... by paul248 · 2004-01-23 16:07 · Score: 5, Funny

It all depends on how many of those 64 bits are 1's. 1's are a lot heavier than 0's, so too many of them will slow your program down a lot. If you compare a 32-bit program with all 1's, it will run significantly slower than a 64-bit program with only a few 1's. It's simple, really.

Re:It all depends... by Anonymous Coward · 2004-01-23 16:10 · Score: 2, Funny

How do you figure? CMOS only uses energy when transitioning between one and zero, when both transistors are in the ohmic region (drawing current). I don't see how 1 is any more heavy than 0.
Re:It all depends... by Anonymous Coward · 2004-01-23 16:11 · Score: 4, Funny

Here on planet Jokeania, we laugh at his statement.
Re:It all depends... by jusdisgi · 2004-01-23 16:13 · Score: 0, Redundant

OMG

That was funnier than shit. Too bad I ain't got the mod.

--
Given a choice between free speech and free beer, most people will take the beer.
Re:It all depends... by Anonymous Coward · 2004-01-23 16:14 · Score: 2, Funny

Pointy haired boss: Dilbert! My laptop is awfully heavy. Is there anything I can do?

Dilbert: Sure! Just start randomly deleting things. All that data can be pretty heavy!!!!

(later)
Pointy haired boss: Hmmm...Windows? My house already has all I need! *click* Yes! That's gotta be like 5 pounds!
Re:It all depends... by Anonymous Coward · 2004-01-23 16:16 · Score: 1, Funny

does that mean you can optimize your code by constantly xoring variables with themselves?
Re:It all depends... by Uncle+Gropey · 2004-01-23 16:48 · Score: 5, Funny

It's not that the 1's are heavier, it's that they tend to snag in the system bus and take longer to travel than the smoother 0's.

--
My blog can kick your blog's ass
Re:It all depends... by paul248 · 2004-01-23 16:50 · Score: 2, Funny

That's why I modded my CPU to handle 0's and 8's instead. I call it the 8thlon XP.
Re:It all depends... by Frymaster · 2004-01-23 16:59 · Score: 4, Funny

it's that they tend to snag in the system bus and take longer to travel than the smoother 0's.
this reminds me of "back in the day" when we ran a token ring network. when end users would complain about net outage we'd simply tell them that the token got stuck or, worse yet, lost. fortunately, we have a backup token on floppy back in the systems room. it's an fddi token, mind you, so it's a bit bigger but if you don't kink the cabling it should work fine for now.

--
2 1337 4 u!
Re:It all depends... by appleLaserWriter · 2004-01-23 16:59 · Score: 4, Funny

1's are a lot heavier than 0's

On early systems, particularly before the 286, the mass differential between 0 and 1 was a serious issue. However, the 286's innovative pipeline system introduced a shift in focus from mass to width. As pipelines became increasingly narrow, words composed primarily of "1"s began to execute at a more rapid pace than those with a heavy weighting of "0"s.
Re:It all depends... by Art+Tatum · 2004-01-23 17:02 · Score: 3, Funny

Did you ever have them look for the token under their desks? More fun than telling them where the "any" key is. :-)
Re:It all depends... by 74nova · 2004-01-23 17:32 · Score: 1

the real problem lies in the amount of space they take up. you can clearly see that the length of the 1 is easily surpassed by the circumference of the 0. they take more ink, more pixels, and clearly more cpu power to process. this isn't even taking into account the area they take up. the area covered by a 0 is far greater than that of a 1. on top of that, the area in the middle is simply wasted!

--
use your turn signal! you people act like it's divulging information to the enemy
Re:It all depends... by jsmarshall85 · 2004-01-23 17:44 · Score: 1

of course you know that they aren't really 1's and 0's, but a voltage representation of the two. a 0 would be a low voltage and a 1 would be a high voltage relative to each other. so for the computer to process a 1 it would be more of a drain since it has to provide more power to get all that voltage passed down the pipe. if the 0 was represented by 0 volts then you can see why processing a 1 would be harder to do.

hope that clears things up for everyone :o) i was an electronics tech in the navy before i became a geek

--
Jerry Marshall
Re:It all depends... by Frymaster · 2004-01-23 18:04 · Score: 0, Offtopic

art tatum? shouldn't you be dead or something?

--
2 1337 4 u!
Re:It all depends... by fucksl4shd0t · 2004-01-23 18:15 · Score: 4, Funny

Man, I'm going offtopic, but back in my oil-changing days...
Some new guy had started working, and his neck was redder than desert sand. He told me that his girlfriend's car had a blinker out on the left and he replaced the bulb and the light didn't come back on. I asked him if he checked his blinker fluid. He said he didn't know what blinker fluid was. I told him that blinker fluid sits in a reservoir in the middle of the car, and when you make a turn the fluid flows in the opposite direction of the turn, into the blinkers, to make sure that the electrical connection is good.
He spent 3 hours the next morning, on his day off, calling up parts stores and asking them if they had any blinker fluid. Poor guy. I had to break it to him slowly...

--
Like what I said? You might like my music
Re:It all depends... by amaupin · 2004-01-23 18:26 · Score: 1

That's why I do all my machine coding in MS Word Times New Roman point 18... nice smooth fonts. You losers and your pixelly text editors!
Re:It all depends... by Anonymous Coward · 2004-01-23 18:52 · Score: 0

Man, you're a cunt, but that's the funniest thing I've heard all week.
Re:It all depends... by Art+Tatum · 2004-01-23 19:03 · Score: 1

I'm out on good behavior. But one false move and *snap*, back I go. I figure I can't get into too much trouble just posting on Slashdot but only time will tell...
Re:It all depends... by Canadian_Daemon · 2004-01-23 20:07 · Score: 1

Yah, I'm assuming it's a joke, I don't think anyone is _that_ stupid. Laugh sometimes. He's talking about wasted space inside the 0.

--
This sig is definitive. Reality is frequently inaccurate.
Re:It all depends... by Anonymous Coward · 2004-01-23 21:59 · Score: 0

Didn't I read this exact same story on Fark? Sounds like you're telling one of those cliche "CD-ROM/Cup Holder" kinda stories.
Did you also ask him to check his horn oil level?
Re:It all depends... by fucksl4shd0t · 2004-01-23 22:06 · Score: 1

You may have read it on Fark. Blinker fluid has been flowing around the mechanic business for years. I certainly didn't think it up, and I'm certainly not the first person who's tried to hoax someone with it. :) But my story is true.

--
Like what I said? You might like my music
Re:It all depends... by Anonymous Coward · 2004-01-23 23:10 · Score: 0

Parent is modded interesting? *slaps moderator*
Re:It all depends... by ari_j · 2004-01-23 23:11 · Score: 4, Funny

In high school, we put a girl up to getting her blinker fluid topped off at a service station. She went and asked about it, and the next day was quite irate with us. But that didn't stop us - within a week, we sent the same girl to go have the summer air taken out of her tires, to be replaced with winter air. Apparently she went back to the same shop to have them take care of this for her.

That's the difference between a natural blonde and a dyed blonde.
Re:It all depends... by Anonymous Coward · 2004-01-23 23:14 · Score: 0

...and we here at /. were geeks before we became geeks.
Re:It all depends... by Shanep · 2004-01-24 00:19 · Score: 1

Here on planet Jokeania, we laugh at his statement.

Ahh, but real geeks take more delight in humour that points out the truth in the geekiest manner possible.

--
War crimes, torture, lies, illegal spying... Would someone give Bush a blowjob, already, so he can be impeached?
Re:It all depends... by Anonymous Coward · 2004-01-24 00:36 · Score: 0

That's the difference between a natural blonde and a dyed blonde.

Yes, but how well did she suck dick? Hey? Hey!?

I thought so.
Re:It all depends... by Shanep · 2004-01-24 00:39 · Score: 1

On early systems, particularly before the 286, the mass differential between 0 and 1 was a serious issue.

Yes, it was most evident during the paper tape and punch card days, when it was actually tangible.

--
War crimes, torture, lies, illegal spying... Would someone give Bush a blowjob, already, so he can be impeached?
Re:It all depends... by MeanMF · 2004-01-24 02:34 · Score: 1

Of course 1's are heavier - the 0's have a hole in the middle to make them lighter.
Re:It all depends... by Des+Herriott · 2004-01-24 04:55 · Score: 1

1's are a lot heavier than 0's

Huh? Everyone knows that 1's are only a bit heavier than 0's!
Re:It all depends... by Jason+Hood · 2004-01-24 05:03 · Score: 0

Um, mod parent as Gay...

--
Are you intolerant of intolerant people?
Re:It all depends... by spauldo · 2004-01-24 14:36 · Score: 1

Heh... we used to do this, but my all-time favorite was muffler bearings.

We knew this guy who thought he knew all about cars but couldn't find an alternator - you know the type, so full of shit it runs out their ears - Anyway, he had asked a friend of mine to work on his car (he said he hated working on chrysler products) because it was making a noise. My friend sent him to the store after muffler bearings... Man was he pissed :)

--
Those who can't do, teach. Those who can't teach either, do tech support.
Re:It all depends... by fucksl4shd0t · 2004-01-24 16:37 · Score: 1

Your name isn't Javier, by any chance, is it? I used to work with a guy named that, and he was the first (and only) person I ever heard about muffler bearings from. Even when I went into exhaust later, the exhaust guys hadn't heard of muffler bearings. Not sure what to conclude from that, of course. :)

--
Like what I said? You might like my music

Short answer by Anonymous Coward · 2004-01-23 16:07 · Score: 0, Interesting

From reading the article, the answer is: Sometimes, depending.

More SCO code. by Ken+Broadfoot · 2004-01-23 16:08 · Score: 2, Funny

From the article:

I create a very simple C file, which I call hello.c:

main()
{
printf("Hello!\n");
}

Watch out... SCO owns this bit of code too...

--ken

--
Bitcoin pyramid: Join here: http://www.bitcoinpyramid.com/r/1427 it's FREE!

Re:More SCO code. by jusdisgi · 2004-01-23 16:10 · Score: 1

Too bad that's so off-topic; it's funny.

Why don't you pull it out again on a SCO story?

--
Given a choice between free speech and free beer, most people will take the beer.
Re:More SCO code. by jusdisgi · 2004-01-23 16:16 · Score: 4, Funny

Hey...that's funny...I just called a post off-topic that was a direct quote from the article.

Cool.

Of course, the funny thing is that I'm right (in a way).

--
Given a choice between free speech and free beer, most people will take the beer.
Re:More SCO code. by Anonymous Coward · 2004-01-23 16:59 · Score: 0

It seems from the wording of your second post that you modded the parent before you RTFA. That's classic.
Re:More SCO code. by ski2die · 2004-01-23 17:00 · Score: 1

I realize this is a joke, but it raises a generic question about copyrighting computer source code and using design patterns. Could the author of a book on design patterns tweak the fine print of their copyright notice in such a way that they could SCO anyone who took their example of a Singleton pattern and implemented it?
Re:More SCO code. by Anonymous Coward · 2004-01-23 17:37 · Score: 0

No, the version that SCO claims to own the copyright to is this:

void main()
{
printf("hello world.\n");
}

But no doubt SCO will claim that the code in question is significantly similar. It uses the words "void", "main" and "printf" in the same place, in the same way, and it achieves the same purpose, so it must be the same code, right? Add to that the fact that the formatting is so similar, it's clear that someone obviously just changed the contents of the text string to make it appear to be different code.
Re:More SCO code. by root:DavidOgg · 2004-01-23 17:42 · Score: 1

Obviously not, you can't mod threads you post in.

--
--AROS is an Open Source AmigaOS clone, and source compatible with AmigaOS! Try the x86 build at http://www.aros.org
Re:More SCO code. by jusdisgi · 2004-01-23 19:00 · Score: 1

True dat...no mods today...
But he's right, too...I didn't RTFA. Not even now. And probably not ever.
Hehe.

--
Given a choice between free speech and free beer, most people will take the beer.
Re:More SCO code. by Anonymous Coward · 2004-01-23 20:26 · Score: 0

I don't think your ID# is high enough to post such stupid sh*t.
Re:More SCO code. by Anonymous Coward · 2004-01-23 20:51 · Score: 0

Actually, that should be:

int main(void)
{
printf("Hello!\n");
return 0;
}
Re:More SCO code. by Anonymous Coward · 2004-01-23 21:29 · Score: 0

looks like someone has a pet troll stalking them :)
Re:More SCO code. by Anonymous Coward · 2004-01-23 21:41 · Score: 0

I love the way you try to correct the code, yet miss out the "#include <stdio.h>".
Re:More SCO code. by Ohreally_factor · 2004-01-23 22:04 · Score: 1

I was under the impression that one couldn't mod if one had RTFA.

--
It's not offtopic, dumbass. It's orthogonal.
Re:More SCO code. by Anonymous Coward · 2004-01-24 05:41 · Score: 0

The linker will resolve it.
Re:More SCO code. by agentforsythe · 2004-01-24 11:42 · Score: 1

SCO as a verb? fantastic! My vocabulary just increased ever so slightly
Re:More SCO code. by parksie · 2004-01-27 00:48 · Score: 1

Probably just as well I didn't bother reading the article, if their English writing style is as crap as their C. There's a fair few things missing from that code.

And no whining from anyone about implicit int, I'm talking C99; implicit types were a bad idea even when they were allowed.

architectural differences... by jusdisgi · 2004-01-23 16:08 · Score: 4, Informative

I can only assume that this is only going to be limited to SPARC...I mean, we've already seen the major differences between Itanium and Opteron dealing with 32 bit apps, right? Or is this a different question, since Opteron gets to run 32bit effectively "native"? And, at this point, when running 32 bit apps on a 64 bit chip, just what can "native" mean anyway?

--
Given a choice between free speech and free beer, most people will take the beer.

Re:architectural differences... by ebbomega · 2004-01-23 16:34 · Score: 1

Native is a very specific concept actually. If you can take a piece of code in its very bare-bones assembler 1s and 0s and put it through a processor and obtain the expected result, it processes that code natively. If not, then it can't. Itanium cannot process 32-bit code natively. It needs an emulator which trades those 1s and 0s to a different format that the Itanium can read (thus changing the input to the processor) and then the processor can read it. Obviously emulation would take a lot more clock cycles than native performance, and thus speed is greatly improved when using native processing as opposed to emulated.

--
Karma: Non-Heinous
Re:architectural differences... by calidoscope · 2004-01-23 17:10 · Score: 4, Informative

I can only assume that this is only going to be limited to SPARC...
Probably applicable to the G5 as well (and Alpha, PA-RISC, MIPS), which like the SPARC has pretty much the same architecture for 32 bits and 64 bits.
The Itanic has an IA-32 subsystem hanging on it - performance is really poor compared to the main 64 bit core. The Opteron has more registers available in 64 bit mode than 32 bit mode and should show some performance improvements just for that reason.
As has been said mucho times - 64 bit processors really shine when you have lots of memory to work with. Having said that, one advantage of 64 bits is being able to memory map a large file and can result in better performance even with much less than 4 GB of memory - witness the MySQL tests.

--
A Shadeless room is a brighter room.
Re:architectural differences... by Witsu · 2004-01-23 17:15 · Score: 1

So does that mean for apps that don't require more than 4Gb of memory, such as current games, they would run slower in 64bit mode on an Opteron if they were simply recompiled?
Re:architectural differences... by Anonymous Coward · 2004-01-23 18:03 · Score: 0

Even by your definition, the Itanium does run 32-bit "natively" -- As far as any traditional x86 software is concerned, the Itanium is 100% the real deal. They even claim it boots CP/M-86, OS/2, and everything else.

However, even though Itanium does it all in hardware, it's still slow, so "non-native" really means "don't do that".
Re:architectural differences... by calidoscope · 2004-01-23 18:03 · Score: 1

Games for the SPARC, G5, etc would probably run a bit slower when recompiled for 64 bits, but the Opteron would likely speed up a bit (more registers available in 64 bit mode).

--
A Shadeless room is a brighter room.
Re:architectural differences... by Anonymous Coward · 2004-01-23 19:16 · Score: 0

Wow. Nice recap, jackass.
Re:architectural differences... by King_TJ · 2004-01-23 19:32 · Score: 1

Quite so... and this is probably why it's no small coincidence that the freshly released developer notes for the PowerMac G5 point out its ability to handle 2GB memory sticks, for a total capacity of 16GB of RAM in one machine.
(Currently, the advertising claims the G5 can utilize "up to 8GB of RAM" - but this is apparently only because it's pretty tough getting ahold of a 2GB PC3200 DIMM right now; not because of a limitation of the motherboard.)
Re:architectural differences... by Anonymous Coward · 2004-01-23 20:19 · Score: 0

You can access more than 4 gigs in 32 bit windows already: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/memory/base/physical_address_extension.asp
Re:architectural differences... by MechaStreisand · 2004-01-23 21:34 · Score: 1

one advantage of 64 bits is being able to memory map a large file and can result in better performance even with much less than 4 GB of memory...
I've heard this stated many times but not once have I seen any evidence backing it up. The concept doesn't even seem like it would be any faster. Think about it this way: you're mapping a file to memory to bypass all the relatively complex filesystem code - on something that is completely I/O-bound. Why bother?

In fact, (rant mode on), I've even heard people say that you could get rid of the filesystem completely this way. Just map the entire hard drive to memory! Brilliant! Except there's no way to implement things like journalled writes without file system code. Or maybe even a versioning filesystem. Why is this a good idea again? Why should we throw away an abstraction that provides us with things we use now?

--
Disclaimer: IANAL. This post is, however, legal advice, and creates an attorney-client relationship.
Re:architectural differences... by Anonymous Coward · 2004-01-24 03:50 · Score: 0

Itanium can run 32-bit code very fast and does not need an emulator to use 32-bit pointers etc.
Or are you thinking about x86?
Re:architectural differences... by calidoscope · 2004-01-24 04:57 · Score: 1

Think about it this way: you're mapping a file to memory to bypass all the relatively complex filesystem code - on something that is completely I/O-bound. Why bother?
Let's say you have a 16 GB file that you need to rummage through (e.g. a database). With a 32 bit system, you have to first figure where the chunk of data is in the file, read it and then have the application determine how to map the data.
With a 64 bit system, you make use of the virtual memory facilities of the OS and hardware - accessing a chunk of the file just involves the OS and not your application. The speed up comes from not duplicating the OS's job of figuring out what part of the file can be stored in memory.
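For what it's worth, the idea can be sketched in a few lines of C. This assumes a POSIX system with mmap(); the function name count_byte_mmap is made up for illustration and is not from the article or the MySQL tests. The file is mapped once and then scanned like an ordinary array, with the kernel deciding which pages are resident.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count how often `target` occurs in the file at `path` by mapping the
 * whole file read-only and scanning it like an ordinary array.
 * Returns -1 on error. (Hypothetical helper, for illustration only.) */
long long count_byte_mmap(const char *path, unsigned char target)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {              /* mmap rejects zero-length mappings */
        close(fd);
        return 0;
    }

    /* A 32-bit process cannot map a file larger than its address space
     * in one piece; a 64-bit process can map files far larger than RAM. */
    const unsigned char *data =
        mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                          /* the mapping stays valid after close */
    if (data == MAP_FAILED)
        return -1;

    long long hits = 0;
    for (off_t i = 0; i < st.st_size; i++)  /* pages are faulted in on demand */
        if (data[i] == target)
            hits++;

    munmap((void *)data, (size_t)st.st_size);
    return hits;
}
```

Calling count_byte_mmap("some.db", 'a') would scan the whole file without a single explicit read() or seek in the application; whether that is actually faster depends on the access pattern and the OS.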

--
A Shadeless room is a brighter room.
Re:architectural differences... by calidoscope · 2004-01-24 05:07 · Score: 1

Currently, the advertising claims the G5 can utilize "up to 8GB of RAM" - but this is apparently only because it's pretty tough getting ahold of a 2GB PC3200 DIMM right now; not because of a limitation of the motherboard.
Getting 2 GB PC2100 registered ECC DIMM's is easy - Crucial sells them for $899, 4 GB PC2100 registered ECC DIMM's are available for (gulp) $6999. You can stuff 16 GB per processor on a Sun US-IIIi box, but Sun only supports 4 GB (because they haven't qualified anything larger than 1 GB DIMM's).
The problem is that the PowerMac G5's use non-ECC memory and are about the only boxes that will use that type - the Opterons use ECC.

--
A Shadeless room is a brighter room.
Re:architectural differences... by Anonymous Coward · 2004-01-24 05:17 · Score: 0

Agreed, but then again this is slashdot. Most idiots on here define "32-bit code" to mean their UT or HL or Neverwinter Nights or whatever.
Currently only HP-UX and its compiler support 32-bit native (along with 32-bit shared libs, dynamic loader, etc) on the itanium. I can sort of see why Windows wouldn't care too much for ia64 32bit native (they would have to bug all their ISV's to port again) but I'm SURPRISED Linux doesn't offer this support, i.e. a 32/64 Hybrid ecosystem.
Probably due to the fact gcc sucks and can't emit that 32bit code. Then again even in 64bit mode it can get only about 30% of the performance of the HP or Intel compiler.
Re:architectural differences... by Mojo+Trolljo · 2004-01-24 06:13 · Score: 1

Out of any applications, I would say games are inherently less portable than anything else. This is why you see a ton of games on Windows 32bit but nowhere else. So I doubt a simple recompile is even feasible first of all. Also since you can't mix and match object code of different word size on any platform (even Opteron) you would require the entire runtime environment and associated DLL's to be available in 64bit mode like DirectX, etc.
But supposing we do have a 64bit game, will it run faster? That depends:
1. Did the on-disk data structures expand? If so, how good is your I/O subsystem (Opteron probably has the disadvantage here at least over Sparc boxes)
2. Did in-network data structures expand? If so, how good is your network subsystem (see point 1 about Opteron vs Sparc)
3. Was data alignment preserved when the structures expanded?
4. Did it take advantage of increased virtual memory and shared memory? Do you have enough RAM on your machine so that you don't start paging?
5. Does it have issues with data-cache misses because of larger data types? How big a cache you got?
Registers aren't the only thing in the equation...
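To make points 1 to 3 a bit more concrete, here is a rough C sketch. The struct names are invented for illustration, and the sizes in the comments assume the usual ILP32 and LP64 data models. A record declared with long and a raw pointer changes layout when recompiled for 64 bits, while one built from fixed-width fields keeps the same size and offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Naive record: field widths follow the compiler's data model, so the
 * layout differs between a 32-bit (ILP32) and a 64-bit (LP64) build. */
struct naive_record {
    char  tag;        /* 1 byte, followed by alignment padding   */
    long  length;     /* 4 bytes on ILP32, 8 bytes on LP64       */
    void *payload;    /* 4 bytes on ILP32, 8 bytes on LP64       */
};

/* Fixed-width record: same size and field offsets in either build,
 * which is what you want for anything written to disk or the network. */
struct fixed_record {
    uint8_t  tag;
    uint8_t  pad[3];       /* padding spelled out instead of implied */
    uint32_t length;
    uint64_t payload_off;  /* a file offset instead of a raw pointer */
};

size_t naive_record_size(void) { return sizeof(struct naive_record); }
size_t fixed_record_size(void) { return sizeof(struct fixed_record); }
```

On a typical LP64 system the naive record comes out at 24 bytes versus 12 on ILP32, so an array of them takes twice the cache and twice the I/O; the fixed record stays at 16 bytes either way.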

--
This post was made by I, Mojo Trolljo, for you to read that was written by I who is Mojo Trolljo!
Re:architectural differences... by MechaStreisand · 2004-01-24 08:41 · Score: 1

Ahh, that makes sense. Thanks.

--
Disclaimer: IANAL. This post is, however, legal advice, and creates an attorney-client relationship.
Re:architectural differences... by Hoser+McMoose · 2004-01-24 11:29 · Score: 1

you would require the entire runtime environment and associated DLL's to be available in 64bit mode like DirectX, etc.

That's exactly how Linux for AMD64 does things, it has two separate copies of all the libraries. This is also how Sun did things with Solaris, as seen in the article.

However, Microsoft is taking a different approach. In their 64-bit Windows, they have ONLY 64-bit libraries. The 32-bit code goes through a thunking layer to translate 32-bit library calls to 64-bit ones. There are, of course, some advantages and disadvantages to either method.
Re:architectural differences... by Anonymous Coward · 2004-01-24 23:30 · Score: 0

(not saying this in an angry/confrontational tone) - can you back this up? post a URL to the devnote?

Re: OSNews by Anonymous Coward · 2004-01-23 16:09 · Score: 0

I beg to differ. OSNews rocks.

The Jeffrey -- since 1979

Moving more data by Sean80 · 2004-01-23 16:10 · Score: 4, Interesting

I'm no expert in this specific area, but I remember a conversation from a few years back about the 32-bit versus the 64-bit version of the Oracle database. The guy I was speaking with was pretty knowledgeable, so I'll take his word as truth for the sake of this post.

In his explanation, he said something of the order of "if you want speed, use the 32-bit version of the binaries, because otherwise the computer physically has to move twice as much data around for each operation it does." Only if you want the extra memory mapping capability of a 64-bit binary, he said, would you need to use the 64-bit version.

I suppose in summary, though, it depends on exactly what your binary does.

Re:Moving more data by renehollan · 2004-01-23 16:15 · Score: 3, Insightful

*cough* wider data busses *cough*. 'course this does mean that 64 bit code on systems with 32 bit wide data paths will be slower, but, like, migration always involves speed bumps. I remember the a.out to elf transition pain days of Linux.

--
You could've hired me.
Re:Moving more data by momerath2003 · 2004-01-23 16:18 · Score: 2, Funny

...because otherwise the computer physically has to move twice as much data around for each operation it does.

64-bit computers have to physically move data around? I suppose I'll have to buy a grappling arm attachment for my G5 to get it to work. :(

--
I had but a simple dream, to destroy all humans.
Re:Moving more data by Anonymous Coward · 2004-01-23 16:29 · Score: 0

hehe. pure genius.
Re:Moving more data by Anonymous Coward · 2004-01-23 16:29 · Score: 0

when did *coughs* become voluntary?
Re:Moving more data by Waffle+Iron · 2004-01-23 16:33 · Score: 4, Informative

*cough* wider data busses *cough*. 'course this does mean that 64 bit code on systems with 32 bit wide data paths will be slower
By the same token, 32-bit code on systems with 64-bit wide data paths will move twice as many pointers in one bus cycle.
Today's CPUs almost completely decouple buses from ALU-level operations. Buses usually spend most of their time transferring entire cache lines to service cache misses, so if your pointers take up a bigger portion of a cache line, 64-bit code is still consuming more bus clock cycles per instruction on average no matter how wide your buses are.
BTW, 32-bit processors have been using 64-bit external data buses since the days of the Pentium-I.
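A back-of-the-envelope version of the cache line argument, as a small C sketch. The 64-byte line size is an assumption (it varies by CPU), and both function names are made up for this example.

```c
#include <stddef.h>

enum { CACHE_LINE_BYTES = 64 };   /* assumed line size; varies by CPU */

/* How many pointers of the given width fit in one cache line. */
size_t pointers_per_line(size_t pointer_bytes)
{
    return CACHE_LINE_BYTES / pointer_bytes;
}

/* Cache lines occupied by a table of `count` pointers, rounded up. */
size_t lines_for_table(size_t count, size_t pointer_bytes)
{
    size_t total = count * pointer_bytes;
    return (total + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}
```

With these assumptions a line holds 16 four-byte pointers but only 8 eight-byte ones, so a table of 1000 pointers goes from 63 lines to 125, and that many more line fills have to cross the bus whatever its width.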
Re:Moving more data by starm_ · 2004-01-23 16:47 · Score: 1

Then you should just have 128 bit buses for 64 bit code, and you would get the same performance.
Re:Moving more data by renehollan · 2004-01-23 17:06 · Score: 2, Insightful

Well yes, but there is an advantage to smaller pointers when you can get away with them *if the processor has native support for them*. While exploited in small object allocators, it isn't always the case that the CPU can gallop through instructions as fast as they can be fed to it, multiple functional units notwithstanding. Though, clearly this is an issue only with data and pointers at the closest cache level to the processing units.
So, memory bandwidth remains an issue, and I concede the point.
Still, bus widths tend to optimize around typical transfer patterns, and pointers tend to grow to be "always big enough" -- the cases where we tailor pointers to be within smaller constraints are quite specialized. It's more convenient to have one pointer size -- does anyone remember the four memory models that Microsoft C compiler used to support (probably still does)? tiny (16 bit data and code pointers), small (16 bit data, 32 bit code, IIRC), large, and huge? 'Course that isn't a perfect comparison because of the brain dead segmented x86 memory architecture, but you get the idea. It was (is) a pain.
But, bus widths and memory capacities will grow to the point where the 64 bit code of tomorrow will be as fast as the 32 bit code of today, and the need to optimize further will occur only in esoteric bits of code.
Besides, with 64 bits, you can do fun things, like allocate different objects in different virtual memory spaces and use the memory management system to catch wild-pointer bugs (because no two different objects need be adjacent in the logical memory space).
On the whole the advantages outweigh the disadvantages, and the performance penalties will be moot quite shortly.
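A rough illustration of the wild-pointer trick, assuming a POSIX system with mmap() and mprotect(). The names guarded_alloc and guarded_free are hypothetical, and this is only a sketch of the idea, not how any particular allocator does it. Each object gets its own pages with an inaccessible page on either side, which is affordable only when address space is plentiful.

```c
#define _DEFAULT_SOURCE           /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round `bytes` up to a whole number of pages. */
static size_t round_to_pages(size_t bytes, size_t page)
{
    return (bytes + page - 1) / page * page;
}

/* Allocate `bytes` with one no-access guard page before and after. */
void *guarded_alloc(size_t bytes)
{
    if (bytes == 0)
        return NULL;

    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    size_t page = (size_t)page_size;
    size_t body = round_to_pages(bytes, page);

    /* Reserve everything as inaccessible, then open up only the middle. */
    unsigned char *base = mmap(NULL, body + 2 * page, PROT_NONE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    if (mprotect(base + page, body, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, body + 2 * page);
        return NULL;
    }
    return base + page;
}

/* Release a block from guarded_alloc; `bytes` must match the request. */
void guarded_free(void *ptr, size_t bytes)
{
    if (ptr == NULL)
        return;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t body = round_to_pages(bytes, page);
    munmap((unsigned char *)ptr - page, body + 2 * page);
}
```

Any access that strays into a guard page faults immediately instead of silently corrupting a neighbour. Note that an overrun is only caught once it runs past the end of the object's last page, and each object costs at least three pages of address space, which is the part that 64 bits makes cheap.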

--
You could've hired me.
Re:Moving more data by dfung · 2004-01-23 17:14 · Score: 5, Informative

Oh, now I'll *cough* a little too.

Modern processors (which actually stretches back at least 10 years) really want to run out of cache as much as possible, both for instruction and data access. And they've never wanted to do it more than now when in the x86 world, the processor core and L1 cache are operating at 3200MHz vs. 400MHz for the RAM.

One thing that has to happen is that you make a bet on locality of execution (again both for instructions and data) and burst load a section of memory into the caches (L2 and L1, and sometimes even L3). In implementation terms, it takes some time to charge up the address bus, so you increase bandwidth and execution speed by charging up address n, but doing a quick read of n+1, n+2, n+3, and more on the latest CPUs. You only have to wiggle the two low-order address lines for the extra reads, so you don't pay the pre-charge penalty that you would for access randomly in memory.

That's good if you're right about locality and bad if you're wrong. That's what predictive branching in the processor and compiler optimizations are all about - tailoring execution to stay in cache as much as possible.

On a 64-bit processor, those burst moves really are twice as big and they really do take longer (the memory technology isn't radically different between 32- and 64-bit architectures, although right now it would be odd to see a cost-cutting memory system on a 64-bit machine). If all the accesses of the burst are actually used in execution, then both systems will show similar performance (the 64-bit will have better performance on things like vector opcodes, but for regular stuff, 1 cycle is 1 cycle). If only half of the bursted data is used, then the higher overhead of the burst will penalize the 64-bit processor.

If you're running a character based benchmark (I've never looked at gzip, but it seems like it must be char based), then it's going to be hard for the 64-bit app and environment to be a win until you figure out some optimization that utilizes the technology. If your benchmark was doing matrix ops on 64-bit ints, then you'll probably find that that Opteron, Itanium, or UltraSparc will be pretty hard to touch.
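
As a small illustration of "an optimization that utilizes the technology", here is a hedged C sketch with made-up function names that has nothing to do with how gzip actually works. A byte-oriented check (is this buffer pure 7-bit ASCII?) is shown once per character and once with eight characters tested per 64-bit register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-at-a-time reference: returns 1 if every byte is 7-bit ASCII. */
int is_ascii_bytes(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        if (buf[i] & 0x80)
            return 0;
    return 1;
}

/* Same check, eight bytes per step using one 64-bit register. */
int is_ascii_words(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    const uint64_t high_bits = UINT64_C(0x8080808080808080);
    size_t i = 0;

    for (; i + 8 <= len; i += 8) {
        uint64_t word;
        memcpy(&word, buf + i, sizeof word);  /* safe unaligned load */
        if (word & high_bits)                 /* any byte with bit 7 set? */
            return 0;
    }
    for (; i < len; i++)                      /* leftover tail bytes */
        if (buf[i] & 0x80)
            return 0;
    return 1;
}
```

Both functions give the same answer; on a 64-bit target the second one does an eighth of the loop iterations, which is the kind of restructuring a char-based benchmark needs before wider registers can help.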

A hammer isn't the right tool for every job as much as you'd like it to be. I actually think that the cited article was a reasonable practical test of performance, but extrapolating from that would be like commenting on pounding nails with a saw - it's just a somewhat irrelevant measure.

I guess I'm violently agreeing with renehollan's comment about speed bumps - apps that can benefit from an architectural change are as important as more concrete details such as compiler optimizations.
Re:Moving more data by 74nova · 2004-01-23 17:36 · Score: 1

I'm no expert by any means, but don't you have addresses that are twice as long, too? Surely looking up an address that is twice as long would take at least a little more time.

--
use your turn signal! you people act like it's divulging information to the enemy
Re:Moving more data by starm_ · 2004-01-23 17:56 · Score: 1

But theoretically couldn't you just double all the necessary buses and registers so that the CPU doesn't get any performance degradation? Everything works the same way but with double the word width.

I guess I can see that it could take more gates to process a bigger word size in each CPU cycle, so that it would maybe cause a new limiting factor on your maximum clock speed.

Were you referring to that when you said there is an advantage to smaller pointers?
Re:Moving more data by root:DavidOgg · 2004-01-23 17:59 · Score: 1

isn't the address accessed in parallel?

--
--AROS is an Open Source AmigaOS clone, and source compatible with AmigaOS! Try the x86 build at http://www.aros.org
Re:Moving more data by starm_ · 2004-01-23 18:04 · Score: 1

Hey, that's a pretty good resume you got there. You must cost big bucks. I went to high school in Montreal.

Don't bother answering my second question. I don't know what I'm talking about. I was just speculating. I'm sure doubling everything would just be inefficient. I don't know enough about CPUs to make such claims.
Re:Moving more data by ckaminski · 2004-01-23 18:08 · Score: 1

You can indeed implement separate memory spaces per process in a 32 bit memory model. Having memory spaces based on a per thread model (for those platforms that support it) one heap per thread, with special access permissions to that thread, would go a long way to allowing us to prevent wild threads from trashing memory. Windows could have done something like this. Granted the performance hit could be horrible, but it's a great feature I'd love to see on Linux and/or Windows.
Re:Moving more data by nikster · 2004-01-23 18:35 · Score: 2, Interesting

"if you want speed, use the 32-bit version of the binaries, because otherwise the computer physically has to move twice as much data around for each operation it does."

if that was true, 16 bit would be even faster than 32. this is not the way electron shuffling works.

i think it's more a question of standardization: the entire PC world has been sworn in on 32 bit, and has optimized the last little bottleneck to perform best on 32 bit data (buses, registers, etc.) throughout the entire machine, but probably most notably in memory subsystems...

there are always specialized apps which will benefit from 64/32/16 bit operations, but for the majority of apps, the memory optimizations will be the only factor.
Re:Moving more data by spoonboy42 · 2004-01-23 18:47 · Score: 1

A hammer isn't the right tool for every job as much as you'd like it to be.

Ah, but can't a Hammer also execute 32-bit code natively? ;)

--
Anonymous Luddite: "What do you think of the dehumanizing effects of the Internet?"
Andy Grove: "Not Much."
Re:Moving more data by Anonymous Coward · 2004-01-23 20:33 · Score: 1, Insightful

The other factor is that developers who are targeting a 32-bit machine will generally avoid things like "long long int" in their C programs because of the horrible performance of doing 64-bit ops on a 32-bit processor. Often programs are even optimized as written for 32-bit platforms, and moving to 64-bit will only hurt the performance until those design decisions are changed.
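
To make that cost concrete, here is a minimal Python sketch (not taken from any real compiler or runtime) of how a single 64-bit add has to be split up when only 32-bit registers are available: add the low halves, then add the high halves plus the carry. The name `add64_on_32bit` is made up for illustration.

```python
MASK32 = 0xFFFFFFFF

def add64_on_32bit(a, b):
    """Add two unsigned 64-bit values using only 32-bit pieces.

    Mimics the add / add-with-carry pair a 32-bit CPU would issue.
    Returns the sum wrapped modulo 2**64.
    """
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    lo = a_lo + b_lo              # first 32-bit add
    carry = lo >> 32              # carry out of the low half
    hi = a_hi + b_hi + carry      # second add, consuming the carry

    return ((hi & MASK32) << 32) | (lo & MASK32)
```

That is two dependent adds where a 64-bit machine needs one; multiplication and division are split into more pieces still.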
Re:Moving more data by soundsop · 2004-01-23 20:47 · Score: 3, Interesting

In implementation terms, it takes some time to charge up the address bus, so you increase bandwidth and execution speed by charging up address n, but doing a quick read of n+1, n+2, n+3, and more on the latest CPUs. You only have to wiggle the two low-order address lines for the extra reads, so you don't pay the pre-charge penalty that you would for access randomly in memory.

This is incorrect. It has nothing to do with charging the address lines. Loading multiple sequential locations is slow on the first access and fast on the subsequent bytes because a whole memory row (made of multiple words) is read at once. This full memory row (typically around 1kbit) is transferred from the slower capacitive DRAM storage to faster transistor-based flip-flops. The subsequent sequential words are already available in the flip-flops so it's faster to route them off-chip since the slow DRAM access is avoided.
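
A toy Python model of that row behaviour, as a rough sketch only: it assumes a single open row of 128 bytes (the roughly 1 kbit figure above) and simply counts how often a new row has to be opened. Real DRAM has multiple banks and far more detail; the function name `count_row_activations` is hypothetical.

```python
def count_row_activations(addresses, row_bytes=128):
    """Count how often a new DRAM row must be opened.

    Toy model: one row is open at a time. Touching an address in a
    different row costs a slow activation; touching the open row is fast.
    """
    activations = 0
    open_row = None
    for addr in addresses:
        row = addr // row_bytes
        if row != open_row:
            activations += 1
            open_row = row
    return activations

sequential = range(1024)                 # 1024 consecutive bytes
strided = range(0, 1024 * 128, 128)      # 1024 bytes, one per row

print(count_row_activations(sequential))
print(count_row_activations(strided))
```

Both loops touch the same number of bytes, but the sequential one pays the slow row access only once per 128 bytes.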
Re:Moving more data by thogard · 2004-01-23 23:12 · Score: 1

16 would be faster if it was big enough, but since it's not large enough for address calculations, it's slower overall. The 32->64bit transition doesn't have the same advantages as the 16->32 migration did, so your point isn't valid.
Re:Moving more data by matfud · 2004-01-24 02:58 · Score: 1

The external address bus is often limited to 40 or so bits. There is no current need for a wider address bus than this.

64 bit internal addressing is useful because of the amount of virtual address space it provides. That virtual address space is then mapped to a much smaller physical address size (or to disks).

Current 32 bit machines can access more than 4Gig (2^32) of physical memory. However each process that runs is limited to 4Gig of virtual address space.

64 bit machines can allow each process to access 2^64 bytes of virtual address space.
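
For scale, a quick Python sketch of how many byte addresses each width gives, assuming byte-addressable memory. The 40-bit case is the external bus width mentioned above; the helper name is just for illustration.

```python
def address_space_bytes(address_bits):
    """Number of distinct byte addresses reachable with the given width."""
    return 2 ** address_bits

GIB = 2 ** 30

for bits in (32, 40, 64):
    size = address_space_bytes(bits)
    print(f"{bits}-bit addresses: {size // GIB:,} GiB")
```

So 32 bits covers 4 GiB, 40 bits covers 1 TiB, and a full 64 bits would cover 16 EiB.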

matfud
Re:Moving more data by renehollan · 2004-01-24 04:40 · Score: 1

This is true, and points out why early optimization is a bad idea. Of course, taking this view, optimization is always too early, and therefore should never be done -- wait for the faster technology. But, that isn't practical.
Really, if you had the time, you'd make your optimizations conditional, and wait for the rest of the system to do a better job than you do.

--
You could've hired me.
Re:Moving more data by renehollan · 2004-01-24 04:54 · Score: 1

starm wrote: "But theoretically couldn't you just double all the necessary buses and registers so that the CPU doesn't get any performance degradation? Everything works the same way but with double the word width."
Exactly!
Except, we haven't done that yet. That's the "speed bump" to which I alluded.
Now, others have pointed out that storing more smaller bits in cache uses, well, less cache, but I'd counter that more pointers in cache increases the chance that you'll reference something out of cache and the likelihood of going out to slow DRAM. So, there's a case where smaller pointers are going to bite you, not because they're smaller though, but because you're managing so many pointers (and keeping them small for storage reasons), that your app is probably vectoring (code- or data-wise) all over the place.
Look, there will always be some application where you have to tailor the algorithm and data structures to the details of the hardware. Been there, done that. But, in general 64 bit pointers on a post-modern processor (with bigger caches, memory, etc.) will be "as fast as" 32 bit pointers on a modern machine, or 16 bit pointers on a legacy machine, or 16 bit pointers on an 8080 (which had an 8 bit data bus). Faster, actually, because of all the other speedups.
But, the very first bleeding edge move from 32 to 64 bits will likely come with a performance penalty.
Me, I like to code Foo*, i.e. pointer to Foo, and let the compiler do the right thing. Having a switch to globally control the compiler is probably a good idea. But, having my share of "near" and "far" pointers in the old segmented architecture days of the 8086/8088/80286 was painful. Of course the hit between a near and far pointer was far worse then because of the segmentation issue. Then again, who knows how many virtual address lookup tables you'll have with a 64 bit address space.

--
You could've hired me.
Re:Moving more data by renehollan · 2004-01-24 05:06 · Score: 1

on the smaller pointers...
No, I was thinking of small object allocators. Think dynamic memory allocation where you want to allocate, say, two bytes or even one dynamically, and you need a bunch of them.
Well, a traditional allocator generally maintains a doubly linked list (or other structure) of allocated or available blocks. This overhead swamps out the size of the small object you need and is grossly inefficient.
What you do, is not allocate one or two bytes at a time, but a medium-sized block of several hundred (or dozens if the block size is 10 bytes or so) in one fell swoop, and then allocate out of that smaller blocks of a given size (programs usually only deal with blocks of a small number of different known sizes).
Now, keeping track of the free small blocks within the larger blocks will involve making a linked list, yes? (well, you could use a bitmap, but that has 12.5% overhead if allocating 8 bit bytes). But, in each larger block, you only have a small number of smaller blocks, and counting (or indexing) them does not require big, wide 32 or 64 bit pointers. An example might help.
Consider an allocation where you need to dynamically allocate single eight bit bytes. You allocate a block of 255 of them, and use the numbers 1 to 255 to index them. Gee, that kind of number fits in a single byte, so you link the free list within the block that way. The only complication arises when returning a byte to the free list: which larger 255 byte block did it come from? Well, you can maintain a table of block starting addresses, and search within it to find the larger block from whence it was allocated. It helps a bit if you allocate the 255 byte blocks on 256 byte boundaries (waste a byte to contain the index to the head of the free list).
Now, searching the table of larger blocks allocated is an O(log n) operation, and returning a byte to its correct block is an O(1) operation. Allocation is O(1): you keep a list of larger blocks of the same size.
Andrei Alexandrescu has an excellent chapter on small object allocators in his book, "Modern C++ Design".
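
The single-block part of the scheme above can be sketched in Python like this. It is a simplified model, not Alexandrescu's code: the class name `ByteBlock` is made up, and the table that maps a returned byte back to its owning block is left out. Slot 0 is the "wasted" byte holding the head of the free list, and slots 1 to 255 are the one-byte objects.

```python
class ByteBlock:
    """One 256-byte block handing out single-byte objects.

    Slot 0 stores the index of the first free slot (0 means "none").
    Each free slot stores the index of the next free slot, so the
    free list needs no memory beyond the block itself.
    """

    def __init__(self):
        self.mem = bytearray(256)
        # Chain the free list: 1 -> 2 -> ... -> 255 -> 0 (end).
        for i in range(1, 255):
            self.mem[i] = i + 1
        self.mem[255] = 0
        self.mem[0] = 1                  # head of the free list

    def alloc(self):
        """Pop a slot off the free list; O(1). Returns None if full."""
        head = self.mem[0]
        if head == 0:
            return None
        self.mem[0] = self.mem[head]     # unlink the slot
        self.mem[head] = 0               # hand it out zeroed
        return head

    def free(self, index):
        """Push a slot back onto the free list; O(1)."""
        self.mem[index] = self.mem[0]
        self.mem[0] = index
```

Every link is a one-byte index, so the bookkeeping costs one byte per 256-byte block no matter how wide the machine's real pointers are.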

--
You could've hired me.
Re:Moving more data by renehollan · 2004-01-24 05:11 · Score: 1

Doesn't Windows support thread-local storage, so you only share the global data you have to? Of course, a wild pointer in that area will still mess things up (rather like an irate bull in a china shop -- the bull has to be irate, though).
But, I was thinking of a separate address space for each object allocated within an object-oriented paradigm. No more falling off the end of arrays, or mis-casting pointers. I'd expect performance to suck, of course, as virtual memory tables get changed on each pointer dereference, but, one can dream.

--
You could've hired me.
Re:Moving more data by Hoser+McMoose · 2004-01-24 11:42 · Score: 1

You mean, like the Opteron? Or the UltraSparc III? Heck, why not just go whole hog and get a HUGE bus like the IBM Power4 has.

Of course, the real key is not how wide the bus is but how much data you can pump through it. As a result, the Opteron and the Pentium 4 have the same data rate to main memory (6.4GB/s) despite the fact that the Opteron has a 128-bit wide bus (but running at 400MT/s) while the P4 has a 64-bit wide bus (running at 800MT/s).
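
The arithmetic behind those figures, as a small Python sketch (the function name is hypothetical, and 1 GB is taken as 10^9 bytes):

```python
def peak_bandwidth_gb_per_s(bus_width_bits, megatransfers_per_s):
    """Peak transfer rate in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * megatransfers_per_s * 1e6 / 1e9

print(peak_bandwidth_gb_per_s(128, 400))   # 128-bit bus at 400 MT/s
print(peak_bandwidth_gb_per_s(64, 800))    # 64-bit bus at 800 MT/s
```

Halving the width and doubling the transfer rate cancel out, which is why both come to 6.4 GB/s.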
Re:Moving more data by ckaminski · 2004-02-05 02:31 · Score: 1

Windows, Linux, and many Unixes do indeed support TLS, but the heap is global to the entire process. What I'm thinking about is a thread heap, such that each thread has its own protected memory heap that cannot be tread upon by other threads (by far the #1 cause of memory screw-ups for my programs).

The benefit to having a thread heap is that by killing the thread, the OS can safely kill handles owned by that thread, files, heaps, sockets, devices, etc. Currently, it's too hard to safely kill a thread.

Couldn't time fix this? by Transcendent · 2004-01-23 16:11 · Score: 3, Insightful

Aren't there certain optimizations and, in general, better coding for most 32 bit applications (on the lowest level of the code) because people have used it for so long? Couldn't it just be that we need to refine coding for 64 bit processors?

Most "tech gurus" I've talked to at my university about the benefits of 64bit processing say that it is in part due to the increase of the number of registers (allowing you to use more at the same time, shortening the number of cycles needed). Could time allow us to write more efficient kernels, etc for 64 bit processors?

So either the code isn't good enough, or perhaps there's another physical limitation (longer pipelines, etc) on the chip itself? Correct me if I'm wrong.

Re:Couldn't time fix this? by Ken+Broadfoot · 2004-01-23 16:16 · Score: 3, Informative

"Most "tech gurus" I've talked to at my university about the benefits of 64bit processing say that it is in part due to the increase of the number of registers (allowing you to use more at the same time, shortening the number of cycles needed)."

Not just kernels. All programs. However, this happens in the compiler, or in assembly code. Not in "kernels" unless they are assembly code kernels.

Basically this test is moot without using compilers optimized for the 64 bit chips..

--ken

--
Bitcoin pyramid: Join here: http://www.bitcoinpyramid.com/r/1427 it's FREE!
Re:Couldn't time fix this? by Anonymous Coward · 2004-01-23 16:19 · Score: 0

Aren't there certain optimizations and, in general, better coding for most 32 bit applications (on the lowest level of the code) because people have used it for so long? Couldn't it just be that we need to refine coding for 64 bit processors?

Definitely. Real-world tests show that hand-optimised, direct-to-integer assembly language 64-bit code blows 32-bit out of the water. Compilers have had, what, 16 years now to become optimized for 32-bit access, with all kinds of tips and tricks, but there are none for 64-bit; in fact much 64-bit code is still being slowed down by assumptions made by compilers. Use anything but gcc, like maybe IBM's or Intel's own in-house compilers, and you will see the difference.
Re:Couldn't time fix this? by Anonymous Coward · 2004-01-23 16:48 · Score: 0

"Increased number of registers" has nothing to do with whether or not a CPU is 32 bit or 64 bit. Register count is a migration issue when you're moving from a crappy register-starved ISA to a less crappy ISA that supports more registers. 32 bit PPC, for example, is not register-starved in the same way as IA32 so register count is irrelevant in moving from 32 bit to 64 bit PPC. Both 32 bit and 64 bit PPC have 32 GPRs.
Re:Couldn't time fix this? by destiney · 2004-01-23 17:03 · Score: 1

Not in "kernels" unless they are assembly code kernels.

Exactly what is an assembly code kernel? Is this some gross generalization? I'm not aware of any kernel that doesn't have its syscalls done in assembly if that's what you mean. For example you can't have user space write() until you have kernel space write() done in assembly.
Re:Couldn't time fix this? by drinkypoo · 2004-01-23 17:34 · Score: 2, Informative

64 bit architectures do not automatically have more general purpose registers than 32 bit ones. x86-64 happens to have twice as many GPRs as x86 (16 versus 8), but that's a special case.
The benefit of a 64 bit processor is a larger address space and the ability to work on 64 bit data types much much faster than on a 32 bit system. More GPRs is an additional, separate benefit.

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Re:Couldn't time fix this? by tepples · 2004-01-23 17:41 · Score: 1

Not just kernels. All programs

The field of benchmarking uses "kernel" to mean "inner loop, especially a standardized one for benchmarking". The field of operating systems uses "kernel" to mean "the part of the operating system that accesses the bare metal on behalf of application programs". True, more people think of the OS definition.
Re:Couldn't time fix this? by iamacat · 2004-01-23 19:29 · Score: 1

I suspect that compilers are doing just fine - optimization algorithms shouldn't change much just because you double the maximum integer size. The problem is that programmers don't write code that takes full advantage of 64 bit.

Take OpenSSL, the first library benchmarked, for example. We all know RSA involves some arithmetic with large numbers. I am sure it would benefit nicely from higher-precision integers if the code didn't always just use 32 bit.

Or consider an application that manipulates a complex data structure stored in a 10GB file. With 32 bit, it would have to use file offsets to represent pointers, implement its own page cache and worry about elements crossing page boundaries. Under a 64 bit OS, it can just mmap the whole file at a fixed address, use regular pointers and let the OS do the caching taking all the running processes into account.
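
A rough Python sketch of the "let the OS do the caching" half of that idea, using the standard `mmap` module. Python has no raw pointers, so this still indexes by offset, and it does not map at a fixed address; the helper name `read_u32_at` and the little-endian uint32 record format are assumptions made for the example.

```python
import mmap
import os
import struct
import tempfile

def read_u32_at(path, offset):
    """Map the whole file and read a little-endian uint32 at `offset`.

    The OS pages data in on demand; the program just indexes memory
    and keeps no page cache of its own.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return struct.unpack_from("<I", m, offset)[0]

# Tiny demo file standing in for the 10GB one.
fd, demo_path = tempfile.mkstemp()
os.write(fd, struct.pack("<3I", 10, 20, 30))
os.close(fd)
print(read_u32_at(demo_path, 4))
os.remove(demo_path)
```

In a 32-bit process the same call would simply fail once the file no longer fits in the address space, which is the limitation being described.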

Of course it makes sense to write code for the most common machines in use today. But then don't say 64 bit applications are larger or slower - it's just 32 bit applications compiled for 64 bit instruction set.
Re:Couldn't time fix this? by tigga · 2004-01-23 19:56 · Score: 1

Aren't there certain optimizations and, in general, better coding for most 32 bit applications (on the lowest level of the code) because people have used it for so long? Couldn't it just be that we need to refine coding for 64 bit processors?
I believe gcc just isn't the right tool to do the job. 64-bit optimizations in gcc are non-existent AFAIK.
He should use Sun's compiler for benchmarks.
Re:Couldn't time fix this? by Anonymous Coward · 2004-01-24 03:08 · Score: 0

Most "tech gurus" I've talked to at my university about the benefits of 64bit processing say that it is in part due to the increase of the number of registers (allowing you to use more at the same time, shortening the number of cycles needed). Could time allow us to write more efficient kernels, etc for 64 bit processors?

The key here is to define what one is referring to when they say a CPU is 64 bit. For example Sun's SuperSparc processors are fully 64 bit except with respect to memory addressability. Thus if you're not constrained by the limitation of 32 bit memory addressing then the UltraSparc's 64 bitness won't buy you anything over the SuperSparc processors.

As far as decreased speed goes, 32 bit pointers, variables, etc. will double in size when increased to 64 bits. This means that memory requirements will grow. With today's multi-gigabyte capable systems this doesn't appear to be a problem. But consider that processor cache is still at a premium. Larger pointers will require more room in the cache, which means caches will have to increase (if possible) or less information can be stored in the same size cache. Is this difference noticeable? The answer, as always, is: it depends. Probably not for most applications.
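
A back-of-the-envelope Python sketch of that cache effect, under stated assumptions: a doubly linked list node holding one 32-bit value plus two pointers, no padding, and a 64 KiB cache. The node layout and cache size are picked for illustration only; real compilers would typically pad the 64-bit node to 24 bytes.

```python
import struct

# "=" means standard sizes with no padding, so the results do not
# depend on the machine running this.
NODE_32 = struct.calcsize("=iII")   # int32 value + two 32-bit pointers
NODE_64 = struct.calcsize("=iQQ")   # int32 value + two 64-bit pointers

def nodes_that_fit(cache_bytes, node_bytes):
    """How many whole nodes fit in a cache of the given size."""
    return cache_bytes // node_bytes

CACHE = 64 * 1024
print(NODE_32, nodes_that_fit(CACHE, NODE_32))
print(NODE_64, nodes_that_fit(CACHE, NODE_64))
```

Pointer-heavy structures lose the most; an array of chars is the same size either way.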

gcc? by PineGreen · 2004-01-23 16:11 · Score: 5, Interesting

Now, gcc is known to produce shit code on sparcs. I am not saying 64 is always better, but to be honest, the stuff should at least have been compiled with Sun CC, possibly with -fast and -fast64 flags...

Re:gcc? by PatMouser · 2004-01-23 16:54 · Score: 4, Informative

Yup! It turns out poorly optimized code in 32 bit mode and I shudder to think what the 64 bit code would look like.

And before you start complaining, that comes from 3 years coding for a graphics company where every clock tick counts. We saw a MAJOR (like more than 20%) difference in execution speed of our binaries depending upon which compiler was used.

Hell, gcc didn't even get decent x86 (where x>4) support in a timely manner. Remember pgcc vs. gcc?
Re:gcc? by ctr2sprt · 2004-01-23 17:21 · Score: 2, Interesting

gcc is known to produce shit code on computers. I find these benchmarks interesting not because of what they say about the hardware, but because of what they say about gcc. It would make me nervous if my 64-bit platform of the future were tied to gcc. I hope for AMD's sake that they are working very hard either on producing their own compiler (maybe they have and I just haven't heard about it) or making gcc stop sucking quite so hard.
Re:gcc? by Anonymous Coward · 2004-01-23 17:30 · Score: 0

didn't know Postgres made compilers as well
Re:gcc? by tigga · 2004-01-23 20:03 · Score: 1

gcc is known to produce shit code on computers
Do you have gcc for abacus ? ;)
By the way, one more free software compiler is TENDRA - www.tendra.org (I have no idea about its performance though).
Re:gcc? by orbitalia · 2004-01-23 21:39 · Score: 2, Insightful

You mean like this portland compiler

Actually I wouldn't say that gcc produces particularly bad code on all computers, it's sorta average, but not bad. Certainly the 3.3.x series are a lot better than 2. Pretty good at number crunching and it is more standards compliant than most.
Re:gcc? by Xua · 2004-01-24 01:35 · Score: 1

Look at the file crypto/bn/asm/sparcv8plus.S. It contains mulx and stlx instructions, which are 64-bit SPARC instructions. This is possible because, even on UltraSPARC, the sparcv8plus target builds applications with 32-bit pointers but allows 64-bit math. When the OpenSSL configure script sees that you are compiling on any UltraSPARC, it enables assembler code that uses 64-bit math regardless of which memory model you've specified.

This is why the 32 and 64 bit applications performed at almost the same speed, but the 64 bit application was slower. They both used 64 bit math instructions, but the 64-bit application also used 64 bit pointers.

From man of Sun CC compiler:

v8plus  Compile for the V8plus version of the SPARC-V9 ISA.

By definition, V8plus means the V9 ISA, but limited to the 32-bit subset defined by the V8plus ISA specification, without the Visual Instruction Set (VIS), and without other implementation-specific ISA extensions. This option enables the compiler to generate code for good performance on the V8plus ISA. The resulting object code is in SPARC-V8+ ELF32 format and only executes in a Solaris UltraSPARC environment -- it does not run on a V7 or V8 processor.

Example: Any system based on the UltraSPARC chip architecture

Re: OSNews by Ninwa · 2004-01-23 16:12 · Score: 2, Informative

Well neither of you have provided any actual evidence proving they rock.. or sock... o.O -tromps off to OSNews to check out their benchmarks- I shall be back ^_^

I'll save you guys the read. by Sj0 · 2004-01-23 16:13 · Score: 1

Yes they are, but only by about 10-20%.

Makes me wonder what tricks AMD has managed to pull out of their hat to increase 64 bit performance by 20-30%...

--
It's been a long time.

Re:I'll save you guys the read. by archen · 2004-01-23 16:19 · Score: 2, Funny

The same tricks that boost the performance of their CPU model numbers 20-30% over their clockspeed? =P
Re:I'll save you guys the read. by HardCase · 2004-01-23 16:21 · Score: 3, Funny

Makes me wonder what tricks AMD has managed to pull out of their hat to increase 64 bit performance by 20-30%...

They didn't use an obsolete UltraSparc chip? ;-)
Re:I'll save you guys the read. by ParisTG · 2004-01-23 16:22 · Score: 4, Informative

Makes me wonder what tricks AMD has managed to pull out of their hat to increase 64 bit performance by 20-30%...

They added more registers to an architecture that had very few of them. This is likely where most of the performance increase comes from in 64bit mode on the Opteron, not from the fact that it is 64bit.
Re:I'll save you guys the read. by Anonymous Coward · 2004-01-23 16:25 · Score: 1, Informative

Makes me wonder what tricks AMD has managed to pull out of their hat to increase 64 bit performance by 20-30%...

No tricks. The benefit doesn't come from 64 bit-ness, it comes from other changes in the ISA when something is compiled in 64-bit mode. There are 8 more GPRs on AMD64 than on IA-32. More registers = less movs to/from cache = faster. Also the integrated memory controller can't hurt. Also it has 8 more SSE2 registers IIRC.

Note that none of these things is tied to AMD64 having 64 bit regs. Course there are plenty of 64-bit benefits too (PK ops like RSA are basically instantly 4 times as fast on 64-bit machines as compared to a 32-bit machine).
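
Where a figure like "4 times" could come from, as a hedged Python sketch: with plain schoolbook multiplication, doubling the limb (machine word) size halves the limb count of each operand, so the number of limb-by-limb multiplies drops by a factor of four. This is only a model; real bignum code, OpenSSL's included, uses smarter algorithms and hand-written assembly, and the function name here is made up.

```python
def schoolbook_multiply(a, b, limb_bits):
    """Multiply non-negative ints limb by limb, as a bignum library might.

    Returns (product, number_of_limb_multiplications).
    """
    mask = (1 << limb_bits) - 1

    def limbs(x):
        out = []
        while x:
            out.append(x & mask)
            x >>= limb_bits
        return out or [0]

    la, lb = limbs(a), limbs(b)
    product = 0
    count = 0
    for i, x in enumerate(la):
        for j, y in enumerate(lb):
            product += (x * y) << (limb_bits * (i + j))
            count += 1
    return product, count

a = 2**1024 - 1          # two 1024-bit operands, RSA-sized
b = 2**1024 - 3
print(schoolbook_multiply(a, b, 32)[1])   # limb multiplies with 32-bit limbs
print(schoolbook_multiply(a, b, 64)[1])   # limb multiplies with 64-bit limbs
```

Whether the wall-clock speedup is really 4x depends on how fast the hardware's 64x64 multiply is compared to its 32x32 one.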

How mature are the compilers? by Anonymous Coward · 2004-01-23 16:14 · Score: 5, Interesting

The surmise that ALL 64 bit binaries are slower than 32 is incorrect...

At this stage of development for the various 64-bit architectures, there is very likely a LOT of room for improvement in the compilers and other related development tools and giblets. Sorry, but I don't consider gcc to be necessarily the bleeding edge in terms of performance on anything. It makes for an interesting benchmarking tool because it's usable on many, many architectures, but in terms of its (current) ability to create binaries that run at optimum performance, no.

I worked on DEC Alphas for many years, and there was continuing progress in their compiler performance during that time. And, frankly, it took a long time, and it probably will for IA64 and others. I'm sure some Sun SPARC-64 users or developers can provide some insight on that architecture as well. It's just the nature of the beast.

Re:How mature are the compilers? by T-Ranger · 2004-01-23 16:49 · Score: 4, Insightful

GCC's primary feature is, has always been, and likely will be for a long time: portability. GCC runs on everything.
If you want FAST code you should use the compiler from your hardware vendor. The downside is that they might cost money, and almost definitely implement things in a slightly weird way. Weird when compared to the official standard, weird when compared to the de facto standard that is GCC.
I thought this was common knowledge, at least amongst people who would be trying to benchmark compilers...
Re:How mature are the compilers? by MarcQuadra · 2004-01-23 19:17 · Score: 1

That's my POV too, but I'm not a developer. I'm a sysadmin for the Macs in a PC environment, and I'm advocating a wait for 64-bit apps to mature before we make a serious commitment. I can see now that most of the mainboards and CPUs purchased are 64-bit, but I know they're running 32-bit apps and OS's. I think the best bet for those of us waiting for the 'next big thing' is to wait for the CPU, chipset, and memory market to calm down and make a decision before we commit to another dead-end.

I'm taking care of too many i820 boards to jump to a quick conclusion here.

--
"Sometimes, I think Trent just needs a cup of hot chocolate and a blankie." -Tori Amos on Nine Inch Nails
Re:How mature are the compilers? by Via_Patrino · 2004-01-24 03:17 · Score: 1

If you buy a computer today you are not expecting to wait until tomorrow to get its promised performance. Buy when better compilers are available, then. I think the benchmark is valid.
Re:How mature are the compilers? by Anonymous Coward · 2004-01-25 01:11 · Score: 0

I suppose I should point out that the algorithms tested (RSA, gzip, etc) all use 16 or 32 bit math. (gzip is optimized for 16 bits unless you hack it) This makes the test invalid because the 64 bit binaries are having to use tweezers, as it were. Re-write the code to use 64-bit integers and bit-strings and try again, the diffs will be smaller. But the real value of 64 bit binaries is the memory access -- a big database is nothing but a souped-up disk cache, and it loves more memory. 2 to 4 GB can be quite small when you've got half a terabyte to sift through.

Opteron is faster in 64 bit by citanon · 2004-01-23 16:15 · Score: 5, Informative

But that's only because it has two extra execution units for 64 bit code. 64 bit software is not inherently faster. Most people here would know this, but I just thought I might preemptively clear up any confusion.

Re:Opteron is faster in 64 bit by fifirebel · 2004-01-23 16:26 · Score: 5, Informative

Also because in 64-bit mode, the Opteron has access to more registers. The IA-32 architecture is so register-limited that throwing more registers at any task makes a huge difference.

And 32 bit is slower than 16 bit by gvc · 2004-01-23 16:15 · Score: 5, Interesting

I recall being very disappointed when my new VAX 11/750 running BSD 4.1 was much slower than my PDP 11/45 running BSD 2.8. All the applications I tested: cc, yacc, etc. were faster on the 16-bit PDP than the 32-bit VAX.

I kept the VAX anyway.

Re:And 32 bit is slower than 16 bit by operagost · 2004-01-23 16:34 · Score: 1

Should have run VMS.
:-P

--

Gamingmuseum.com: Give your 3D accelerator a rest.
Re:And 32 bit is slower than 16 bit by BeeazleBub · 2004-01-23 16:36 · Score: 1

That was also found out for x86 with OS/2 and Win95 (to a lesser extent because of its heavy reliance on 16 bit code in certain places) back in '92/93.

I can still feel the pain from the flame wars. If memory serves it all came down to an issue of compiler and library optimization. Most of the 32 bit code was unoptimized for various reasons which caused wide variances in performance.
Re:And 32 bit is slower than 16 bit by Anonymous Coward · 2004-01-23 16:46 · Score: 1, Informative

Don't forget (ugh) all the additional clock cycles needed for 16:16 and 16:32 bit addressing modes- (which we still have, unless you run an older Novell server which uses flat 32 memory mode.) Also, all the additional clock cycles needed to go from real to protected mode (stupid PC hardware stuff- but only on Windoze platforms) and "thunking"- also only on Windoze.

Yes, I remember the days of the first 386es and how much slower they were! I still have a 286-25 (I think...somewhere...) that kicked 386-33 butts.
Re:And 32 bit is slower than 16 bit by trb · 2004-01-23 17:26 · Score: 2, Interesting

Yes, and programs compiled for 16-bit PDP-11 running on the VAX-11/780 in "compatibility mode" were faster than the same programs compiled for 32-bit VAX native mode running on the same VAX. It makes sense, they were doing pretty much the same stuff, and fetching half as much data. But of course, 11's had limited address space, and the VAX address space was relatively huge.
Re:And 32 bit is slower than 16 bit by Anne+Thwacks · 2004-01-23 23:12 · Score: 1

I kept the VAX anyway.
Obviously you didn't have to pay the electric bill. I kept the PDP/11

--
Sent from my ASR33 using ASCII

Not so simple for AMD64 by martinde · 2004-01-23 16:21 · Score: 3, Interesting

My understanding is that when you switch an Athlon64 or Opteron into 64bit mode, that you suddenly get access to more general purpose registers than the x86 normally has. So the compiler can generate more efficient code in 64bit mode, making use of the extra registers and so forth. I don't know if this makes a difference in real world apps or not though.

Re:Not so simple for AMD64 by hobuddy · 2004-01-23 17:07 · Score: 2, Interesting

You can see official AMD benchmark results of various programs running on Windows XP 32-bit edition vs. Windows XP 64-bit edition beginning on page 36 of this PDF. The results have three columns: time in seconds on WinXP 32-bit w/ 32-bit executable, time in seconds on WinXP 64-bit with 32-bit executable, and time in seconds on WinXP 64-bit with 64-bit executable.

The results are quite impressive, but I'm not sure we can trust AMD, since obviously they want AMD64 to look good.

--
Erlang.org: wow
Re:Not so simple for AMD64 by afidel · 2004-01-23 19:13 · Score: 2, Interesting

AND you can get the best of both worlds because data sizes do NOT have to be 64 bit just to use the 64bit registers. In fact here is a quote from an AMD manual on driver porting for 64bit-Windows on AMD64 platforms:
*Stupid freaking AMD engineer made the PDF world accessible but then went and encrypted it so that I can't cut and paste, print it, or do anything with it*
Well basically INT's and LONG's remain 32 bit while Pointers, LONG LONG's and Floats are 64 bit by default.
The paper can be found at AMD's website

--
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
Re:Not so simple for AMD64 by Anonymous Coward · 2004-01-24 02:34 · Score: 1, Informative

with gcc 3.3.2-r2 sizeof long on amd64 is 8 bytes..
Re:Not so simple for AMD64 by Anonymous Coward · 2004-01-24 17:41 · Score: 0

Let's all sing it together now... that's why all good programmers never, ever, forget to use typedef in their architecture file...

int32
int64
etc.
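
To make the typedef point concrete, here is a minimal sketch using Python's ctypes, chosen only because it reports the platform's C type sizes without needing a compiler. The helper name `report_sizes` is made up for illustration; the output depends on the machine and OS it runs on.

```python
import ctypes

def report_sizes():
    """Return the size in bytes of some C types on the machine running this."""
    names = ["c_int", "c_long", "c_longlong", "c_void_p", "c_int32", "c_int64"]
    return {name: ctypes.sizeof(getattr(ctypes, name)) for name in names}

if __name__ == "__main__":
    # c_long and c_void_p vary with the data model; c_int32/c_int64 do not.
    for name, size in report_sizes().items():
        print(f"{name:12s} {size} bytes")
```

On a 64-bit Linux box this should agree with the gcc report above (long is 8 bytes), while 64-bit Windows keeps long at 4 bytes. The fixed-width names are the ones that stay put, which is the whole argument for the typedefs.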

What I found most remarkable... by Grey+Ninja · 2004-01-23 16:21 · Score: 3, Interesting

The guy seemed to have his conclusion written before he started... Or at least that's how it seemed to me. When he was doing the SSL test, he said that the results were ONLY about 10% slower on the 64 bit version. Now I might be far too much of a graphics programmer.... but I would consider 10% to be a rather significant slowdown.

The other thing that bothered me of course was when he said that the file sizes were only 50% bigger in some cases... sure, code is never all that big, but... still...

Re:What I found most remarkable... by gnu-sucks · 2004-01-23 17:52 · Score: 1

The other thing that bothered me of course was when he said that the file sizes were only 50% bigger in some cases... sure, code is never all that big, but... still...

The 'code' is actually the same size. It is the resulting binary which has changed sizes.

I thought the article was very good, and I like the author's honesty in regards to how he did his testing, what was difficult, et cetera.

My only complaint is that many things were mentioned twice (e.g., "I only ran the MySQL test twice because it took two hours to complete" is mentioned twice).

Overall great read though, got me thinking.
Re:What I found most remarkable... by Anonymous Coward · 2004-01-23 18:33 · Score: 0

The other thing that bothered me of course was when he said that the file sizes were only 50% bigger in some cases... sure, code is never all that big, but... still...

I don't know about SPARC operations, but x86 architecture uses variable length instructions, meaning that a single operation uses 1 to N bytes of code. Thus the jump to x86-64 won't double the size of binaries, because the code doesn't change from a fixed 32 bits-per-instruction to a fixed 64 bits-per-instruction, however constants, pointers, and other data compiled in DO change from 32 to 64 bits, so there will be some increase in the size of the finished binary.
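
A rough sketch of why compiled-in data grows when pointers widen. It assumes each field is aligned to its own size and the struct is padded to a multiple of its largest field, which is a common convention but not a rule every compiler and ABI follows. The node layout is a hypothetical example, not something taken from the article.

```python
def struct_size(field_sizes):
    """Size of a C-like struct whose fields are each aligned to their own size.

    Assumption: natural alignment, plus tail padding to the largest field.
    """
    offset = 0
    for size in field_sizes:
        offset = (offset + size - 1) // size * size  # pad up to field alignment
        offset += size
    largest = max(field_sizes)
    return (offset + largest - 1) // largest * largest  # tail padding

# Hypothetical list node: one 32-bit int payload plus "next" and "prev" pointers.
NODE_32 = struct_size([4, 4, 4])   # 4-byte pointers
NODE_64 = struct_size([4, 8, 8])   # 8-byte pointers

if __name__ == "__main__":
    print("32-bit node:", NODE_32, "bytes")
    print("64-bit node:", NODE_64, "bytes")
```

Under these assumptions the payload is unchanged but the node doubles, partly from the wider pointers and partly from padding. Instruction bytes, by contrast, are not affected this way, which fits the observation that binaries grow but do not double.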
Re:What I found most remarkable... by Grey+Ninja · 2004-01-23 19:06 · Score: 1

Yes. That's what I meant. The actual section of the binary occupied by the code is actually quite small. But what I was referring to is that a 50% increase in file size is quite significant in my books.
Re:What I found most remarkable... by gnu-sucks · 2004-01-23 20:45 · Score: 1

I agree, 50% larger, say, system-wide, would be a huge change.

Granted, most people could afford the hard drive. And your text files (/var, /etc, most of /home) wouldn't change.

He did play it off like that 54% increase was no big deal though, you're absolutely right.
Re:What I found most remarkable... by mindriot · 2004-01-24 03:40 · Score: 1

Yeah, his conclusions were also a bit too simple when he got to the size factor:

However, the difference wasn't all that huge, only around 16% to 54% larger for 64-bit than 32-bit. Unless the system was an embedded system with very limited storage space, I can't see this being all that much of a negative factor.

Two words: cache footprint.
Re:What I found most remarkable... by Anonymous Coward · 2004-01-24 07:00 · Score: 0

Yes, but the 54% increase was only on the smallest executable (gzip). *Everything* else was less than a 25% increase. It's likely that there's some fixed increase relative to the 32-bit version, plus a percentage increase, which overall is going to be much more noticeable percentage-wise on very small programs.
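
The "fixed increase plus a percentage" guess can be sketched as a toy model. Every number below is invented for illustration (none comes from the article), and `growth_percent` is a hypothetical helper.

```python
def growth_percent(size32, fixed_overhead, proportional):
    """Percent growth if a 64-bit build adds a fixed cost plus a fraction.

    size32         -- size of the 32-bit binary in bytes
    fixed_overhead -- bytes added regardless of program size (assumed)
    proportional   -- extra bytes per byte of 32-bit binary (assumed)
    """
    size64 = size32 + fixed_overhead + proportional * size32
    return 100.0 * (size64 - size32) / size32

if __name__ == "__main__":
    # Made-up inputs: 10,000 fixed bytes and 10% proportional growth.
    for size32 in (40_000, 400_000, 4_000_000):
        print(size32, round(growth_percent(size32, 10_000, 0.10), 2))
```

With the same fixed and proportional terms, the smallest binary shows by far the largest percentage, and the figure approaches the proportional term as the binary gets bigger.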

Re:*Why* do I have that feeling... by Anonymous Coward · 2004-01-23 16:22 · Score: 0

Read the fucking article. He didn't use hello.c for the benchmarking.

Fuck I hate slashdot idiots who complain about people who don't THINK before they post.

Practice what you preach, asshole.

Re: OSNews by rainwalker · 2004-01-23 16:27 · Score: 4, Insightful

Your "analysis" may be valid, but it's really not applicable. The title of the story is, "Are 64-bit Binaries Really Slower than 32-bit Binaries?" The author takes a 64-bit machine, compiles a few programs, and tests the resulting binaries to see which is faster. I'd say that the review is aptly titled and an interesting point to think on. Certainly he didn't compile every open source program known to mankind, as it sounds like he missed some pet app of yours. OpenSSL might be kind of arbitrary, but gzip and MySQL seem like reasonable apps to test. Like the last page says (you *did* RTFA, right?), if you don't like his review, go write your own and get it published.

Re: OSNews by juuri · 2004-01-23 16:27 · Score: 1

Good luck, you'll need it.

--
--- I do not moderate.

If 32bit is faster than 64... by CatGrep · 2004-01-23 16:27 · Score: 5, Funny

Then 16bit binaries should be even faster then 32.

And why stop there?

8bits should really scream.

I can see it now: 2GHz 6502 processors, retro computing. The 70's are back.

Re:If 32bit is faster than 64... by vrmlknight · 2004-01-23 16:40 · Score: 1

Actually you are correct: a 2GHz 8-bit processor would scream (for simple math and basic operations), but you're limited to 8 bits, which makes it tough to do some complex things. It would also be a real pain to write a program that used large/long numbers, since a lot would need to be swapped around :)

--
This must be Thursday, I never could get the hang of Thursdays.
Re:If 32bit is faster than 64... by Anonymous Coward · 2004-01-23 16:49 · Score: 0

Who cares? That's for the compiler to worry about.
Re:If 32bit is faster than 64... by mercuryresearch · 2004-01-23 17:52 · Score: 1

Actually, yes, a 8-bit would scream.

And since the 6502 only took a few thousand transistors, you could fit ~ 32,000 parallel 6502s on a single chip using today's processor manufacturing technology. Running at 2 GHz. Pretty impressive performance possibilities.

Of course, keeping a 256000 bit 6502 VLIW implementation fed with data would be next to impossible. But quite a few specialized components such as video processors do something like this.
Re:If 32bit is faster than 64... by Brandybuck · 2004-01-23 17:52 · Score: 4, Insightful

You're right. A 2GHz 6502 would be a screamer. But the drawbacks are numerous. When the world finally went to 32bit, I jumped for joy. Not because I thought stuff would be faster, but because I could finally use a flat memory space large enough for anything I could conceivably want. Integers were now large enough for any conceivable use. Etc, etc.

Of course, my conceptions back then might be getting a bit dated now. But not too terribly much. 32 bits will probably be the optimum for general use for quite some time. There's not too many applications that need a 64 bit address space. Not too many applications need 64 bit integers. We'll need 64 bit sometime, but I don't see the need for it in *general* purpose computing for the remainder of the decade. (Longhorn might actually need a 64 bit address space, but that's another story...).

Remembering back to the 80286 days, people were always running up against the 16 bit barrier. It was a pain in the butt. But unless you're running an enterprise database, or performing complex cryptanalysis, you're probably not running up against the 32 bit barrier.

But of course, given that you're viewed as a dusty relic if you're not using a box with 512Mb video memory and 5.1 audio to calculate your spreadsheets, the market might push us into 64 bit whether we need it or not.

--
Don't blame me, I didn't vote for either of them!
Re:If 32bit is faster than 64... by Saint+Stephen · 2004-01-23 19:32 · Score: 1

Integers were now large enough for any conceivable use.

Try doing currency arithmetic for a large multinational (> $4 billion, pretty common) where you have to sum thousands of numbers and exactness counts (so floating point won't work).

It can be done, but it's a bitch.
Re:If 32bit is faster than 64... by juhaz · 2004-01-23 21:52 · Score: 1

Long longs.

Probably it's implemented internally with 2 32 bit ints, and maybe it indeed is a bitch - but hey that's for compiler folks to worry about.
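
For the curious, a minimal sketch of what "2 32 bit ints" amounts to: add the low words, then add the high words plus the carry. It is written in Python with explicit masks to imitate 32-bit registers; the function names are invented here, and a real compiler would emit an add / add-with-carry pair rather than anything like this.

```python
MASK32 = 0xFFFFFFFF

def add64_with_32bit_words(a, b):
    """Add two 64-bit values given as (low, high) pairs of 32-bit words.

    Overflow past 64 bits wraps around, as unsigned C arithmetic would.
    """
    a_lo, a_hi = a
    b_lo, b_hi = b
    lo = a_lo + b_lo
    carry = lo >> 32                      # 1 if the low words overflowed, else 0
    hi = (a_hi + b_hi + carry) & MASK32
    return (lo & MASK32, hi)

def split64(value):
    """Split an integer into (low, high) 32-bit words."""
    return (value & MASK32, (value >> 32) & MASK32)

def join64(words):
    """Rebuild the integer from (low, high) 32-bit words."""
    lo, hi = words
    return (hi << 32) | lo

if __name__ == "__main__":
    total = add64_with_32bit_words(split64(4_000_000_000), split64(4_000_000_000))
    print(join64(total))
```

Addition is the easy case; multiplication and division across word pairs take noticeably more work, which is where native 64-bit registers help.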
Re:If 32bit is faster than 64... by Anne+Thwacks · 2004-01-23 23:15 · Score: 1

But unless you're running an enterprise database, or performing complex cryptanalysis, you're probably not running up against the 32 bit barrier.
Don't worry M$ will save you - pretty soon they will have a version of Word that will need 128 bits and still won't work properly when trying to edit the format of tables.

--
Sent from my ASR33 using ASCII
Re:If 32bit is faster than 64... by Anonymous Coward · 2004-01-24 00:14 · Score: 0

So use long long if your compiler supports it. Or if you're willing to assume IEEE floating point, just use double and scale everything so that the values are integers.
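
A small sketch of the trade-offs here, assuming amounts are held as whole cents. The helper names are hypothetical. Python integers never overflow, so the 32-bit limit is checked explicitly rather than demonstrated by wraparound.

```python
INT32_MAX = 2**31 - 1

def total_cents(amounts_in_cents):
    """Exact sum of integer cent amounts."""
    return sum(amounts_in_cents)

def fits_in_int32(value):
    """Would this value fit in a signed 32-bit C int?"""
    return -2**31 <= value <= INT32_MAX

if __name__ == "__main__":
    print(INT32_MAX / 100)                    # largest dollar amount in int32 cents
    print(fits_in_int32(4_000_000_000 * 100)) # $4 billion expressed in cents
    print(sum([0.1] * 10) == 1.0)             # binary fractions of a dollar drift
    print(total_cents([10] * 10) == 100)      # integer cents stay exact
    print(float(2**53) + 1 == float(2**53))   # doubles stop counting by ones here
```

A signed 32-bit count of cents tops out a little over $21 million, so $4 billion is well out of reach. The double-as-scaled-integer trick works only while every intermediate value stays below 2^53; past that point a double can no longer represent every integer.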
Re:If 32bit is faster than 64... by Xenophon+Fenderson, · 2004-01-24 00:24 · Score: 1

Or even better, use a programming language with decent bignum support, instead of the portable assembler that is C.

--
I'm proud of my Northern Tibetian Heritage
Re:If 32bit is faster than 64... by juhaz · 2004-01-24 00:32 · Score: 1

Yeah, that's always a good option too.

Of course they need to implement it internally with some kind of kludge too if underlying architecture doesn't support it, but as long as it's transparent to user, who cares.
Re:If 32bit is faster than 64... by speighd · 2004-01-24 01:48 · Score: 1

Let's go all the way and run a 4GHz 4004! Just think of how fast we can calculate 4 bit integers!
Re:If 32bit is faster than 64... by Anonymous Coward · 2004-01-24 02:09 · Score: 0

When the world finally went to 32bit
Apple Macs were just born that way! Which world are you talking about?
Silly me, the one with Y2k, virus and worm problems.
Re:If 32bit is faster than 64... by bhtooefr · 2004-01-24 02:58 · Score: 1

Did you know that a Harris 286-25 could compete very well against an AMD 386DX-40 on 16-bit code? Didn't think so.

Just think of the heat a 2GHz 6502 would put out, though *puts a large heatsink on his 6502A@2MHz*

Could one overclock a 6502 that far, and it still be stable (LN2 with very high voltages?)
Re:If 32bit is faster than 64... by Anonymous Coward · 2004-01-24 02:59 · Score: 0

Why stop at 8 bits. Lets all use 4-bit 4004s. Wouldn't that knock your pants off!
Re:If 32bit is faster than 64... by Xenophon+Fenderson, · 2004-01-24 06:29 · Score: 1

Even better, I want it to be transparent to the programmer. I hate nothing more than wasting my time on nitty gritty implementation detail, especially when someone else has probably done it before (and did a better job than me).

--
I'm proud of my Northern Tibetian Heritage
Re:If 32bit is faster than 64... by Anonymous Coward · 2004-01-24 07:29 · Score: 0

Or you could use one of the many available multiple precision libraries, like the GMP or Berkeley MP. More work than built-in support, but all your other languages are built on interpreters using similar MP libraries anyway. Of course, currency is somewhat easy, since you just add/subtract it, for the most part. If you have to do a lot of multiplication/division/fancy operations like square root, well, then you're in trouble. Luckily, these situations usually don't require exact answers, and floating point is fine.

Jebus christ. by eddy · 2004-01-23 16:28 · Score: 2, Informative

This article sounds completely stupid. Someone didn't know that pulling 64-bits across the bus (reading/writing) can take longer than 32-bits? Never thought of the caches?

Just read the GCC Proceedings, there are explanations and benchmarks of the why/how/when of x86-64 in 32 vs 64-bit mode, both speed of execution and image size.

--
Belief is the currency of delusion.

I'd kill for a 64 bit platform... by yecrom2 · 2004-01-23 16:31 · Score: 3, Interesting

The main product I work on, which was designed in a freaking vacuum, is so tightly tied to wintel that I've had to spend the greater part of a year gutting it and making it portable. Kind of. We currently use 1.5 gig for the database cache. If we go any higher, we run out of memory.
We tried win2k3 and the /3gb switch, but we kept having very odd things happen.
This database could very easily reach 500 gig, but anything above 150 gig and performance goes in the toilet.

My solution...

Get a low-to-midrange Sun box that can handle 16+g and has a good disk subsystem. But that's not a current option. Like I said, this thing was designed in a vacuum. The in-memory data-structures are the network data structures. That are all packed on 1-byte boundaries. Can you say SIGBUS? A Conversion layer probably wouldn't be that hard, if it weren't built as ONE FREAKING LAYER!

Sorry, I had to rant. Anyway, a single 64 bit box would enable us to replace several IA32 servers. For large databases, 64bits is a blessing.

Matt

Re:I'd kill for a 64 bit platform... by Anonymous Coward · 2004-01-23 16:59 · Score: 1, Informative

Have you thought about using AWE? (Of course if you just used SQL Server instead of rolling your own database you'd get automatic AWE support...)
We tried win2k3 and the /3gb switch, but we kept having very odd things happen.
Besides possible bugs in your code, that might be because /3GB only leaves 1GB for the OS which might not be enough in some situations. On W2K3 you can try /userva.
Re:I'd kill for a 64 bit platform... by yecrom2 · 2004-01-23 17:29 · Score: 1

Have you thought about using AWE? (Of course if you just used SQL Server instead of rolling your own database you'd get automatic AWE support...)

Been there, done that. SQL server won't keep up. We didn't "roll our own" database, but the overhead of an RDBMS would kill any chance of keeping up.

Besides possible bugs in your code, that might be because /3GB only leaves 1GB for the OS which might not be enough in some situations. On W2K3 you can try /userva.

Bugs in my code? surely you jest! It could very well be 1GB kernel space issue, but even with 3GB it wouldn't be enough for the size of datasets that we would have if we had enough address space to enable us to have a larger dataset that we could have if we had a larger address space...

Matt
Re:I'd kill for a 64 bit platform... by Anonymous Coward · 2004-01-23 20:25 · Score: 0

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/memory/base/physical_address_extension.asp
Re:I'd kill for a 64 bit platform... by miu · 2004-01-23 21:49 · Score: 1

The in-memory data-structures are the network data structures. That are all packed on 1-byte boundaries. Can you say SIGBUS? A Conversion layer probably wouldn't be that hard, if it weren't built as ONE FREAKING LAYER!
Weak. How did the original team convince the compiler that that was okay? Even on architectures that allow a misaligned memory access to complete it winds up slow as hell.

--

[Set Cain on fire and steal his lute.]
Re:I'd kill for a 64 bit platform... by Anonymous Coward · 2004-01-24 07:34 · Score: 0

Actually, all platforms have to be able to perform access on 1 byte boundaries (think C string manipulation). For example, even though MIPS uses 4-byte aligned load/store instructions, it also has additional instructions to speed up byte manipulations. It's just the nature of computing that you have to work with non-aligned data sooner or later. It certainly doesn't violate the C standard.

The key is to get that non-aligned data into aligned formats, but that's only for performance reasons, and YMMV. For example, their setup may make perfect sense if they do a lot of scatter/gather I/O operations, as they'll want to minimize network utilization, yet they also don't want to spend a lot of time marshaling/unmarshaling data in memory when it's not going to actually be processed much.
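
A minimal sketch of the packed-versus-aligned distinction, using Python's struct module. The record layout (a 1-byte tag followed by a 32-bit value) is a made-up stand-in, not the poster's actual wire format, and `unmarshal` is a hypothetical name for the conversion step.

```python
import struct

# Hypothetical wire record: a 1-byte tag followed by a 32-bit value.
PACKED = "<BI"   # "<" = little-endian, standard sizes, no padding
NATIVE = "@BI"   # "@" = this machine's native sizes and alignment

def unmarshal(buffer, offset=0):
    """Copy a packed record out of a byte buffer into ordinary values."""
    tag, value = struct.unpack_from(PACKED, buffer, offset)
    return tag, value

if __name__ == "__main__":
    print("packed size:", struct.calcsize(PACKED))
    print("native size:", struct.calcsize(NATIVE))
    wire = struct.pack(PACKED, 7, 123456)
    print(unmarshal(wire))
```

The packed form is 5 bytes; the native form is typically larger because the 32-bit field gets padded out to its alignment. Copying fields out one at a time, as `unmarshal` does, is the conversion-layer approach: it never dereferences a misaligned address, at the cost of an extra copy.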

Every situation is different. Unless you know more about the problem, judgement should probably be reserved (even if it sounds like this is another one of those mud ball programs).
Re:I'd kill for a 64 bit platform... by Anonymous Coward · 2004-01-24 07:37 · Score: 0

Your application would probably strongly benefit from AMD's 64-bit architecture. Since it's basically just x86 on steroids, you probably won't have to change the code to deal with things like byte order. On the other hand, porting an existing 32-bit application to 64-bits is never trivial, as you'll probably never be able to figure out all the corner cases where an assumption you made when doing the 32-bit code no longer applies in the 64-bit code. I think the main issue here is that programmers need to become more aware of 32/64-bit issues; the long reign of 32-bit computing has pushed many of these issues under the rug, so many inexperienced (and even experienced) programmers never think about them.
Re:I'd kill for a 64 bit platform... by yecrom2 · 2004-01-24 07:55 · Score: 1

The last place I worked, we had almost every unix platform that exists. I started doing all of my development on one of our Digital Unix boxes because it would show any 32bitisms. It's nice working on something that was started with portability in mind.

Matt

More bits doesn't automatically mean more speed by leereyno · 2004-01-23 16:32 · Score: 4, Insightful

The point of a 64-bit architecture boils down to two things really, memory and data size/precision.

An architecture with 32-bits of address space can directly address 2^32 or approximately 4 billion bytes of memory. There are many applications where that just isn't enough. More importantly, an architecture whose registers are 32-bits wide is far less efficient when it comes to dealing with values that require more than 32 bits to express. Many floating point values use 64 bits and being able to directly manipulate these in a single register is a lot more efficient than doing voodoo to combine two 32-bit registers.
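
The address-space arithmetic, as a quick sketch. `addressable_gib` is a hypothetical helper, and it reports only the theoretical flat limit; a real process usually gets less, since the OS reserves part of the range for itself.

```python
def addressable_gib(address_bits):
    """Bytes reachable with a flat address of the given width, in GiB."""
    return 2**address_bits / 2**30

if __name__ == "__main__":
    for bits in (16, 32, 64):
        print(bits, "bits:", addressable_gib(bits), "GiB")
```

32 bits works out to exactly 4 GiB, which is why a database cache in the low gigabytes is already pressing against the ceiling.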

So, if you have a problem where you're dealing with astronomical quantities of very large (or precise) values, then a 64-bit implementation is going to make a very big difference. If you're running a text editor and surfing the web then having a wider address bus and wider registers isn't going to do squat for you. Now that doesn't mean that there may not be other, somewhat unrelated, architectural improvements found in a 64-bit architecture that a 32-bit system is lacking. Those can make a big difference as well, but then you're talking about the overall efficiency of the design, which is a far less specific issue than whether 64-bits is better/worse than 32.

Lee

--
Muslim community leaders warn of backlash from tomorrow morning's terrorist attack.

Re:More bits doesn't automatically mean more speed by Anonymous Coward · 2004-01-23 17:52 · Score: 1, Funny

2^32 bytes of memory should be enough for anyone!
-new bill gates quote
Re:More bits doesn't automatically mean more speed by Weirsbaski · 2004-01-23 18:01 · Score: 1

An architecture with 32-bits of address space can directly address 2^32 or approximately 4 billion bytes of memory. There are many applications where that just isn't enough. More importantly, an architecture whose registers are 32-bits wide is far less efficient when it comes to dealing with values that require more than 32 bits to express.

Absolutely correct, but:

Many floating point values use 64 bits and being able to directly manipulate these in a single register is a lot more efficient than doing voodoo to combine two 32-bit registers.

So, if you have a problem where you're dealing with astronomical quantities of very large (or precise) values, then a 64-bit implementation is going to make a very big difference.

When doing math, the size of the floating-point registers has little to do with the size of cpu. Today's 32-bit x86 processors (and yesterday's 16-bit x86 processors) use 80-bit x87 (floating point) registers, for very precise math on very large values. But they're still 32-bit (and 16-bit) processors.

--

I am not a sig.
Re:More bits doesn't automatically mean more speed by Anonymous Coward · 2004-01-23 20:15 · Score: 0

64 bits also means that when you build hash tables, you don't have to worry quite so much about collision resolution.
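
One way to put numbers on that is the standard birthday-bound approximation, sketched below. It assumes hash values are uniformly distributed, and `collision_probability` is a name made up for this example.

```python
import math

def collision_probability(num_keys, hash_bits):
    """Birthday-bound approximation: chance that any two keys share a hash.

    Assumes uniformly distributed hash values of the given width.
    """
    space = 2.0 ** hash_bits
    exponent = -num_keys * (num_keys - 1) / (2.0 * space)
    return -math.expm1(exponent)   # 1 - exp(exponent), accurate for tiny values

if __name__ == "__main__":
    for bits in (32, 64):
        print(bits, "bits:", collision_probability(100_000, bits))
```

With 100,000 keys, a collision among 32-bit hashes is more likely than not, while among 64-bit hashes it is vanishingly rare. This is about full hash values only; a table that reduces the hash modulo its bucket count still has to handle bucket collisions.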
Re:More bits doesn't automatically mean more speed by Anonymous Coward · 2004-01-23 20:28 · Score: 0

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/memory/base/physical_address_extension.asp

Benchmarks by Anonymous Coward · 2004-01-23 16:33 · Score: 3, Funny

As it needs to be said for any benchmarking story:

There are 3 types of lies. Lies. Damned Lies. ...and benchmarks.

Re:Benchmarks by Frymaster · 2004-01-23 16:55 · Score: 5, Funny

There are 3 types of lies. Lies. Damned Lies. ...and benchmarks.
i've got some specint stats that show that damned lies are up to 30% faster.

--
2 1337 4 u!
Re:Benchmarks by Shanep · 2004-01-23 23:46 · Score: 1

There are 3 types of lies. Lies. Damned Lies. ...and benchmarks.

Benchmarks don't tell lies. They usually tell truths about very specific areas which typically reflect "real world" applications to very varying degrees depending on how and how much those specific areas are used within the "real world" app. The specifics need to be analysed by someone who knows what they're looking for.

If you are in the market for a computer that will have a very specific role, then you might be interested in looking at benchmarks in great detail to find the best system for that specific role.

Benchmarks don't lie, people who interpret what they show, simply misjudge the numbers. Sometimes however, people who do understand what to look for, intentionally misrepresent benchmark numbers to increase marketability of their product. They are the liars, not the benchmark.

PS, I realise this is humour, but I often hear people state that benchmarks are useless. That is not true. They are very useful if their output is analysed properly. These are the same sorts of people who claim the Y2K bug to be all hype.

--
War crimes, torture, lies, illegal spying... Would someone give Bush a blowjob, already, so he can be impeached?
Re:Benchmarks by DinosaurNeal · 2004-01-24 01:03 · Score: 1

Benchmarks do tell lies.

Benchmarks are meant to ideally test minimal pairs, such as the video encoding time difference between a p4 3.2 @266fsb and a p4 3.466 @266fsb. But in benchmarking scientific rigor is always lost, and this turns benchmarks into a rhetorical tool.

But the benchmark choice is frequently meaningless or misleading.

Benchmarks do not elucidate any fact. You will always see in CPU tests LAME encoding. The p4 will always win against an Athlon. The reviewer will not explain why this is the case and that LAME encoding is simply clock cycle dependent. Thus the faster clocked p4 will always win over the slower clocked but more robust Athlon CPU. Benchmarkers need to be able to explain all the dependent variables, to tell why the results happen.

The fact that the quality of the difference of magnitude between variables is not meaningfully measurable leads to future design problems. In graphics cards Q3 benchmarks above a certain magnitude are meaningless. The reviewer with meaningless variables creates an inauthentic conditioned desire in the consumer that leads to bad and lax software and hardware engineering. Morrowind and other games have horrible problems with their graphics engine that can not be saved by faster GPUs and dx9.
Re:Benchmarks by Shanep · 2004-01-24 02:45 · Score: 4, Insightful

Benchmarks are meant to ideally test minimal pairs

And they often show disparity in their results due to being interrupted. This would be a badly carried out benchmark under less than ideal conditions. This is human error. Of course there are slight variations in subsequent runs, but these should be able to be explained and compensated for. It is most certainly not a benchmark lie though. If it took that long, then it took that long, now find out why!

But in benchmarking scientific rigor is always lost

Failing to retain a scientific approach is a human failing. It does not always happen and is not the benchmark telling lies, but due to poor procedure.
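
As a minimal sketch of the kind of procedure being argued for: repeat each run and look at the spread before trusting a number. The function names are invented for illustration, and this is only a toy harness; it does nothing about warm-up, caches, or other load on the machine.

```python
import statistics
import time

def time_runs(func, repeats=5):
    """Call func repeatedly and return the wall-clock duration of each run."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations

def summarize(durations):
    """Report figures that make an interrupted or noisy run easy to spot."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
        "spread": max(durations) - min(durations),
    }

if __name__ == "__main__":
    print(summarize(time_runs(lambda: sum(range(1_000_000)))))
```

A large spread relative to the median is the cue to go find out what interrupted the run, rather than to publish the average.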

But the benchmark choice is frequently meaningless or misleading.

[poor] "Choice", "Meaningless" and "misleading" [results] each require an incompetent person. Don't blame the benchmark. Even if they wrote the benchmark, they might not understand the results.

Benchmarks do not elucidate any fact.

Yes they do. Very very specific facts which can later be used to make considerations for future decisions. It could be a specific application, algorithm, overall CPU ALU, FPU or single CPU instruction, it could be bus type, etc. Specific facts leading to educated decisions.

You will always see in CPU tests LAME encoding. The p4 will always win against an Athlon.

If this is the case, then LAME as it stands is specifically faster on a P4 than an Athlon. That would be a coarse benchmark though. Some would call it "real world". And it is. It is specific to LAME, but not specific at a lower level where it could be found why this might be the case and how to improve LAME on both P4's and Athlons separately (with an end result that might have the Athlon out-perform the P4, due to new insight gained from benchmarking specific areas).

The reviewer will not explain why this is the case and that LAME encoding is simply clock cycle dependent.

So the reviewer's fault becomes the benchmark's fault?

Benchmarkers need to be able to explain all the dependent variables, to tell why the results happen.

Thus my original statements?

In graphics cards Q3 benchmarks above a certain magnitude are meaningless.

Bad choice of benchmark is the fault of the benchmark?

Benchmarks need to be interpreted by someone competent enough to do so. Just because someone carried out a poor benchmark procedure or could not understand the results, does not mean the benchmark lied.

The reviewer with meaningless variables creates an inauthentic conditioned desire in the consumer that leads to bad and lax software and hardware engineering.

Incompetent reviewer, ignorant consumer, deceitful engineering.

Morrowind and other games have horrible problems with their graphics engine that can not be saved by faster GPUs and dx9.

So they are CPU bound? Memory? Sounds like maybe they don't know how to profile their code too well. When profiling, it helps to know how to benchmark and make meaning out of the results.

You cannot improve that which you do not understand, through anything other than luck. Benchmarks provide specific facts which, when correctly interpreted, can bring about improvements. People who can't interpret them, say they are meaningless.

--
War crimes, torture, lies, illegal spying... Would someone give Bush a blowjob, already, so he can be impeached?
Re:Benchmarks by DinosaurNeal · 2004-01-29 23:37 · Score: 1

Your counterpoints are rhetorical. Your rhetoric lacked any empirical counterpoints. Give examples of benchmarks. Link a good case of benchmarking. Find a good benchmark. You are simply covering through weak counter argument some of the points that I was trying to uncover. You can't find good practices of benchmarking because there are problems in the foundations of the ideological and engineering in the tech industry. Why? Because technology is simply business.

There are barely any good consumer products produced by engineers. There are no good cars, no easily adaptive alarm clocks, etc. It's because the information concerning the production of products is private, and because there is no meaningful observation of the use of products. Visual Studio .NET 2003 would not look like it does if Microsoft ever spent time meaningfully watching people use the program. There is only the deep-rooted creation of unauthentic demand for badly designed products. For example Microsoft created a horrible install for Windows 2k and XP. It spends 5 minutes installing SCSI drivers. The installation should ask if you are running a SCSI device before loading the drivers. It then necessitates user input on what should be a user free situation. It should ask for a computer and user name at the beginning of the install and bypass the networking and time zone features in the install.

Benchmarks are units that are marked and interpreted as meaningful. You cannot separate the benchmark, the procedure, and structure of the hardware and software processes. The benchmark value and the procedure ideally elucidate the structure of the hardware and the procedure.

The fact that benchmarks are horrible marketing tools is an empirical fact. Benchmarks create an inauthentic desire for a horrible product. Prove that benchmarks do not quantify a misleading unit that represents measure of development.
( LAME encoding is not a meaningful unit of measurement for the differences between a Pentium and an Athlon (http://www.hardocp.com/article.html?art=NTc1LDI=) )

The fact that all hardware and software products are horribly produced is an empirical fact.

Here are some really (not very) good products: software raid, all computer speakers, Nvidia 5200 chips without zcompression, 128mb ram entry level graphics cards, the PIV under 2ghz.

The best audio cards (RME, Digidesign, Creamware, MOTU) are simply cheap dsps, A/D D/A units, and minimal amounts of cheap ram (how much does ram cost again?) How much do these professional systems cost? How often does Digidesign force upgrades?

A quest: find a systematic review of a top of the line audio card or analog/dv encoding card/system and the relevant software. It does not exist.

Find a systematic review of various computer systems performance in Solid Works or Maya and the profitability and quality improvement of the various upgrade paths. It does not exist.

With current hardware why would we ever need performance reviews of business software?

All top of the line video cards drop by a value of 1/2 within roughly 6 months of their initial time of sale. People buy the top of the line cards not for performance reasons, but for purely social reasons. This neurosis does allow for a convenient business model that keeps the graphics hardware development steady, unlike audio hardware development.

Benchmarking as a science is not developed. If you are doing time tests you also have to do perception tests and so on.

There are no intelligent consumers. Go buy a new Suburban and spend $120 on some new kicks made by little children.
Re:Benchmarks by DinosaurNeal · 2004-01-30 00:31 · Score: 1

After writing all that I thought of the clearest possible manner to elucidate my point unless you attempt to obfuscate by simple obvious points that are meant to function as counter arguments.

1.
A benchmark has to be proved meaningful, that is prima facie has to be proven. In other words, the reviewer has to use a benchmark where the difference in units is a significant value.

A drop of 100 fps to x fps is significant for CS.
What is that minimum value of x.
If all future cards are greater than x, then that benchmark is not significant.

Analogy:
Is a consumer car's maximum speed (miles/hour) a meaningful unit? No, because all cars produced can go over 70mph.
Mpg is a meaningful unit.
Total cost of ownership is a meaningful unit.
The reason body design (SUV, sports car) and maximum speed of cars are taken to be significant units is because people can be socially engineered on an ideological and psychological level to purchase cars based upon those signifiers. Signifiers do not correspond to an ideal world of truth, but to marked units that are meant to affect/construct decisions. Those are not real world benchmarks but synthetic potential decisions.

2.
The review ought to explain the cause of the differences of benchmark values through a full explanation of the lower level structure.

3.
Companies depend on signifiers to sell products. Companies use insignificant benchmark values to sell products. Companies focus on specific benchmark values (3dmark) instead of developing meaningful units of quality measurement. Case in point: the GeForce 5200 was created to be an entry level dx9 compatible card. Nvidia simply cared about two signifiers, dx9 and RAM amount, when creating this card. This card should never have been made. The 5700 should have been the entry level dx9 card, while Nvidia kept its Geforce4 mx line at 32 mb to 64 mb of ram.
Re:Benchmarks by Shanep · 2004-01-30 22:33 · Score: 1

Link a good case of benchmarking.

Go argue with NASA and the US Navy (whom I used to work with, in an electronic weapons engineering capacity).

Find a good benchmark.

I could be here all damn day. I could give low level examples down to the testing of gate propagation delays, to high level examples of final apps running on final hardware and everything in-between.

My quick answer would be: Any test which gives repeatable, meaningful results when run under a scientifically sound procedure.
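
To make that quick answer concrete, here is a minimal sketch of what a repeatable timing procedure could look like. It is an illustration only: the function names and the `example_workload` stand-in are hypothetical, and the real test (a LAME encode, a Q3 timedemo) would replace the workload.

```python
import statistics
import time


def run_trials(workload, trials=7):
    """Time `workload` several times; return the wall-clock seconds per trial."""
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Report the median and the relative spread as a rough repeatability check."""
    median = statistics.median(samples)
    spread = (max(samples) - min(samples)) / median if median > 0 else float("inf")
    return {"median_s": median, "relative_spread": spread, "trials": len(samples)}


def example_workload():
    # Hypothetical stand-in for the real test being benchmarked.
    total = 0
    for i in range(200_000):
        total += i * i
    return total


if __name__ == "__main__":
    print(summarize(run_trials(example_workload)))
```

A large relative spread between trials suggests the procedure is being disturbed (other processes, paging) and the numbers should not be compared yet.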

You are simply covering through weak counter argument some of the points that I was trying to uncover.

Ha! When you use points to talk about benchmarking, you talk about large scale (real world) benchmarks of LAME and Q3 and propose their results to be indicative across the board! And you talk about weak arguments!

I tell you what then, prove this statement: "in benchmarking scientific rigor is always lost". What a bold, foaming at the mouth, nutcase statement. Paint the whole world with the brush you paint yourself with?

You CANNOT prove a generalization with a few weak examples. You usually can't prove most generalizations.

You can't find good practices of benchmarking because there are problems in the foundations of the ideological and engineering in the tech industry. Why? Because technology is simply business.

Technology business comprises many facets. Engineers create with scientific approaches and marketing droids dull down the science to help sales. Witness Intel's MMX. The engineers called it Matrix Math eXtensions, the market droids called it MultiMedia eXtensions (how well would something with the word "math" in it sell!). Regardless of what it's called and how well it sells, it does make great improvements through a SIMD approach designed by engineers. The engineers may have been told to improve the multimedia experience and so they took a scientific approach to do it, because they know, through observation (simulation, benchmarking, etc), that multimedia often uses repetitive instructions on varying data.

Benchmarking does not have to be an entire application running on given hardware. It could be time to execute single instructions, for example.

There are barely any good consumer products produced by engineers.

Engineers take care of the technical details. Without them, we would only have pie-in-the-sky visionaries who would never deliver anything.

There are no good cars, no easily adaptive alarm clocks, etc.

Subjective crap. You have experienced every car and alarm clock that has ever existed and then based on your opinion of them, proclaimed that there are no good ones because you don't think they're good.

Looks like you've found a niche in the market for "good" cars! Quick, go make your millions designing, prototyping, refining and then mass producing, but without using any scientific rigor or benchmarking!

It's because the information concerning the production of products is private, and because there is no meaningful observation of the use of products. Visual Studio .NET 2003 would not look like it does if Microsoft ever spent time meaningfully watching people use the program.

Good stuff. Find one example, then use it to throw blanket statements.

There is only the deep-rooted creation of unauthentic demand for badly designed products.

So there is no demand for well designed products?! I really doubt that you mean this. I don't know you, but based on the way you write I see you are intelligent. That sentence though, is either ill thought out, or the author is deluded severely.

IDE HDD's sell like hotcakes because end consumers don't know of and rarely need SCSI. Yet there is a demand for SCSI by intelligent people (and some less intelligent)!

An IDE drive might have specs of: minimum sustained transfer rate 30MB/s, random see

--
War crimes, torture, lies, illegal spying... Would someone give Bush a blowjob, already, so he can be impeached?
Re:Benchmarks by DinosaurNeal · 2004-02-03 16:01 · Score: 1

Though not specifically stated, my intention, defined by the context of the text, was to semantically contain the extension of my propositions to the set of consumer and prosumer computer fields. That is why I limited my examples accordingly. Without warrant you enlarged the scope of the conversation simply so that you may argue yet another point. I only enlarged the scope for purposes of analogy.

If you would read carefully a statement such as "Benchmarks do not elucidate any fact" you would have seen that it was stated to mean simply that the graphing of numbers alone (such as a Q3 f/s or lame encode value in seconds) does not have any measure of significant value.
1. A proper computer benchmark article/paper should give a lucid explanation of the lower level cause of the values. The attempt to explain causality I believe is the root of any scientific enterprise.
2. A benchmark should also orient itself towards the subjectivity of the potential user. If it is a benchmark concerning time, then the question of the potential perception of that time should also be the focus and not simply the time or f/s value out of context.
3. A product is always being benchmarked. The reviewer should always see if a product makes economic sense answering questions concerning decision theory like, who needs a Gallatin core based system?
4. The websites and magazines should behave like real journalists and scientists. A benchmark article should link or reference other benchmark values from other articles. The reviewer should also reference experts and receive comments by experts.

I was not in any way relating my use of the term 'benchmark' to fields such as the measurement of the conversion efficiency of solar cells. The term 'benchmark' corresponds to computer benchmark sites that test such limited things as, winbench, cpu temperature in relation to heatsink, asio latency, etc.

It is my belief that, in large part, there are no quality computer review sites/magazines that have good benchmarks. Anandtech and Hardocp are moderate in quality for standard computer hardware. Anandtech tries to explain the technology. Hardocp has good editorial that does not overvalue hardware like Tom's does. You cannot find any good benchmarks on any good professional audio cards or digital video equipment.

Subjective crap. You have experienced every car and alarm clock that has ever existed and then based on your opinion of them, proclaimed that there are no good ones because you don't think they're good.

Subjective versus objective? Subjective as in relative? Subjective as in immanent? Subjective as in unverifiable?

I'm not using pure induction am I? I am using adduction.
I stand behind the belief that consumer products are engineered in the worst possible manner. This is done for various reasons but mainly because of market segmentation (SCSI, EIDE) and planned obsolescence.
It is an "objective" fact that a simple thing such as an alarm clock is horribly engineered. Here are 4 good engineering changes that would create a better alarm clock.
1. What is the percentage of alarm clocks with a capacitor backup? Almost all alarm clocks are completely battery dependent for backup.
2. How many alarm clocks are as easy to set as an older VCR? It takes 5 seconds to set the time and date on an older VCR (buttons with hour, minute +, -.) I have used alarm clocks that have taken upwards of 3 minutes to set, because they only had an increase value button that affected the minutes.
3. Alarm clocks most frequently use the perceptually brightest color, green. Why do we need an alarm clock that functions as a night light?
4. How many alarm clocks have settable sleep buttons? Why can we only sleep in for another 7, 9, or 11 minutes? Why is the alarm reset button so near to the repeat alarm button?
Re:Benchmarks by Shanep · 2004-02-04 04:35 · Score: 1

Without warrant you enlarged the scope of the conversation simply so that you may argue yet another point. I only enlarged the scope for purposes of analogy.

Can you blame me for enlarging an argument against such a broad statement? "Benchmarks do not elucidate any fact". They often show repeatable trends outside of finite machines and inside finite machines (when not interrupted with poor procedure) they tend to show very repeatable trends. If there are no trends shown, then maybe there are no trends to show, but those results should normally be repeatable within a finite machine.

If you would read carefully a statement such as "Benchmarks do not elucidate any fact" you would have seen that it was stated to mean simply that the graphing of numbers alone (such as a Q3 f/s or lame encode value in seconds) does not have any measure of significant value.

I take a statement as it stands. Not what was "meant" by it. A value's significance is in the eye of the beholder.

Q3 and LAME time trials do not encompass benchmarking of a whole system. They do however show very specific values which are significant in those very specific areas. I don't use the word "specific" loosely here. I don't mean that a high score in a Q3 test means that the computer is good for 3D games, I mean a high score in a Q3 test means that that computer gave that specific result while running that test! Intelligent interpretation of those results might bring areas of weakness forward to allow for improvement.

Finding a computer system to have overall significant value based on those benchmarks alone is a stupid conclusion to come to. Which comes down to what I have been saying all along: benchmarks of a particular area need to be carried out properly and their results interpreted properly.

If you set out to find the value of a computer and in doing so benchmark it through Q3 and LAME time trials, then you have failed to properly benchmark that computer. You have not found the value of that computer, but have found its value in Q3 and LAME processing. The benchmark of choice did not fail to highlight its very specific area of measurement.

However, if the specific area you want your computer to be the best at, is LAME encoding, then running LAME on various files with various options and comparing other computers with the same procedure might be a good benchmark for that very specific computing requirement.

The perception I have of my Sun Ultra 10 (which I am lovingly writing this to you on), is that it absolutely sucks as a desktop machine for the things that I expect from a desktop machine. The integer performance (this is a broad statement, but still a specific statement) is terrible (on average) if I compare it with my other machines. My old iBook kills it. The memory performance is terrible also. I know I can increase the memory performance by adding more memory to use two banks and use interleaving. I have discovered the lacking memory performance through a benchmark, so I have found out a weak area that can be improved. I have yet to test the FPU, which people claim is much better.

So maybe I would assign this to file serving tasks?

I would find it not so good here either, so should I just assume that it is the poor CPU performance and sell it on eBay to get a good price while it still has exotic value?

Or should I run some benchmarks on it and find out that the disk performance is terrible because the IDE controller sucks arse?

It only has 128MB RAM at the moment, the IDE performance is giving me about a third of what I know the disk is capable of and because I am running Solaris 9 with all the bells and whistles in this small amount of RAM, it is paging a lot through the slow IDE controller! Benchmarks allow me to pinpoint the weakest areas.

Upgrading the RAM (avoiding slow disk paging) and avoiding the IDE controller by throwing in a SCSI card on the PCI bus with the least contention does wonders. Tha

--
War crimes, torture, lies, illegal spying... Would someone give Bush a blowjob, already, so he can be impeached?

This guy is a tool by FunkyMarcus · 2004-01-23 16:34 · Score: 4, Interesting

First, anyone with half a brain already knows what his "scientific" results prove. Second, anyone with two thirds of a brain has already performed similar (but probably better) tests and come to the same conclusion.

And third, OpenSSL uses assembly code hand-crafted for the CPU when built for the 32-bit environment (solaris-sparcv9-gcc) and compiles C when built for the 64-bit environment (solaris64-sparcv9-gcc). Great comparison, guy.

Apples, meet Oranges (or Wintels).

Mark

Re:This guy is a tool by FunkyMarcus · 2004-01-23 16:36 · Score: 1

Assembly code vs. C code refers to the big-number library. No substitutions, exchanges, or refunds.
Re:This guy is a tool by pritchma · 2004-01-23 17:55 · Score: 2, Informative

Dude, he knew you were going to write this comment and so he included page 4 just for you. *grin*
Re:This guy is a tool by spinlocked · 2004-01-23 23:20 · Score: 1

I agree. This was a pointless exercise, particularly since he used GCC but mainly because we've known this ever since Solaris 7 shipped for UltraSPARC I.

This guy gets the 'No Shit, Sherlock' award.

--
# init 5 Connection closed.

Oh... ...bugger.

Something is wrong. by DarkHelmet · 2004-01-23 16:34 · Score: 5, Interesting

Maybe it's me, but how the hell is OpenSSL slower in 64 bit?

It makes absolutely no sense. Operations concerning large integers were MADE for 64 bit.

Hell, if they made a 1024 bit processor, it'd be something like OpenSSL that would actually see the benefit of having datatypes that big.

Something is wrong, horribly wrong with these benchmarks. Either OpenSSL doesn't have proper support for 64 bit data types, this fellow compiled something wrong, or some massive retard published benchmarks for the wrong platform in the wrong place.

Or maybe I'm just on crack.

--
/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i

Re:Something is wrong. by FunkyMarcus · 2004-01-23 16:43 · Score: 4, Insightful

Maybe it's me

It's you.

OpenSSL in the 32-bit environment as the guy configured it was doing 64-bit arithmetic. Just because the guy had 32-bit pointers doesn't mean that his computer wasn't pushing around 64-bit quantities at once. It's called a "long long".
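
The point that pointer width and integer width are separate things can be checked from any Python prompt. This is only a sketch using the standard `ctypes` module; the helper name `describe_widths` is made up for illustration, and the pointer result depends on whether the interpreter itself is a 32-bit or 64-bit build.

```python
import ctypes


def describe_widths():
    """Return the bit widths of a pointer and of a C `long long` in this process."""
    return {
        # 32 on a 32-bit build of the interpreter, 64 on a 64-bit build.
        "pointer_bits": ctypes.sizeof(ctypes.c_void_p) * 8,
        # `long long` is 64 bits on mainstream platforms in either case.
        "long_long_bits": ctypes.sizeof(ctypes.c_longlong) * 8,
    }


if __name__ == "__main__":
    print(describe_widths())
```

On a 32-bit build this would report 32-bit pointers next to a 64-bit `long long`, which is the situation described above.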

In fact, as he had OpenSSL configured, he was using some crafty assembly code for his 32-bit OpenSSL builds that even used 64-bit registers. His 64-bit builds were using plain old compiled C.

But he didn't even know that.

Big whoop.

Mark
Re:Something is wrong. by Anonymous Coward · 2004-01-23 16:43 · Score: 0

It's probably based on some large integer package that isn't taking advantage of the 64 bits that are available to it... shrug.
Re:Something is wrong. by Anonymous Coward · 2004-01-23 16:49 · Score: 0

ssl also does a lot of character-by-character processing.
Re:Something is wrong. by Anonymous Coward · 2004-01-23 16:51 · Score: 1, Informative

Operations concerning large integers were MADE for 64 bit

How? Explain please. Cryptographic algorithms perform logical operations on each individual bit. To set a single bit in a register, you have to do something like:

mov rbx, value   ; value is 0 or 1
shl rbx, 8       ; shift left into bit 8
and rax, ~0x100  ; mask off the target bit 8
or  rax, rbx     ; or value into the register
[perform operation on rax, etc]

Since you're operating on a bitstream - at the bit level - and each bit operation depends on others - my question is: how - in any way, does it matter that rax is 64-bit rather than a 32-bit eax? For simple math, yes, 64-bit will help. For bitwise logic and crypto - no. If anything - besides overflow - a large register size is a hindrance for operations like this.
Re:Something is wrong. by Anonymous Coward · 2004-01-23 16:54 · Score: 0

Or maybe I'm just on crack.

And just how long have you been working for SCO, exactly?
Re:Something is wrong. by Vellmont · 2004-01-23 17:09 · Score: 1

I don't know anything about the Sparc processor, but it's likely that there are hardware tradeoffs to running your code in 64 bit mode. There was a similar situation (though in reverse) with poor branch prediction with the Pentium Pro and 16 bit code (it had worse performance than a Pentium in 16 bit mode, so Windows 95 actually ran slower on a PPro). Certainly it's true that 64 bit code will simply take up more cache space than 32 bit code.

The crypto routines would also have to be written to take advantage of the larger bitsize available.
I think the wordsize of most crypto routines is 32 bits, so you'd need to figure out a way to do two crypts at the same time to take advantage of the extra 32 bits.

The article should be more appropriately titled "Are Sparc 64 bit binaries slower than Sparc 32 bit binaries", as the benchmarks have nothing to do with other 32 vs 64 bit platforms (Athlon 64, G5, etc)

--
AccountKiller
Re:Something is wrong. by ozzee · 2004-01-23 18:13 · Score: 1

don't know anything about the Sparc processor, but it's likely that there are hardware tradeoffs to running your code in 64 bit mode.
Not really, the only thing that is essentially different between a 64 bit executable and a 32 bit executable in a SPARC or MIPS is the size of pointers. So, no wonder - a 32 bit Sparc executable pushing around 32 bit pointers will probably be a tad bit faster than a 64 bit executable. However, the Opteron is significantly different, it has a whole new ISA with lots more registers and real 64 bit operations. An AMD64 executable is likely going to find bottlenecks in your memory subsystems much quicker than a 32 bit executable.
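
A rough sketch of why pointer size alone can matter: the arithmetic below sizes a hypothetical binary-tree node (one 32-bit key plus two child pointers) under 4-byte and 8-byte pointers. The node layout, the natural-alignment padding rule, and the 64-byte cache line are all assumptions chosen for illustration, not measurements of any particular SPARC.

```python
KEY_BYTES = 4          # assumed 32-bit key
CACHE_LINE_BYTES = 64  # assumed cache line size, for illustration only


def node_size(pointer_bytes):
    """Size of a node holding one key and two child pointers.

    The key is padded up to the pointer's alignment, as a typical C ABI
    with natural alignment would do.
    """
    padded_key = -(-KEY_BYTES // pointer_bytes) * pointer_bytes  # round up
    return padded_key + 2 * pointer_bytes


def nodes_per_line(pointer_bytes):
    """How many whole nodes fit in one assumed cache line."""
    return CACHE_LINE_BYTES // node_size(pointer_bytes)


if __name__ == "__main__":
    for width in (4, 8):
        print(width * 8, "bit pointers:", node_size(width), "bytes/node,",
              nodes_per_line(width), "nodes per cache line")
```

Under these assumptions the node doubles from 12 to 24 bytes, so pointer-heavy data structures fit fewer entries in the same cache.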
Re:Something is wrong. by harlows_monkeys · 2004-01-23 20:38 · Score: 2, Informative

How? Explain please
All public key systems currently in use depend on doing arithmetic on large integers. Let's start with the classical algorithms for addition/subtraction/multiplication/division.
The addition and subtraction algorithms are O(N) and multiplication/division is O(N^2), where N is the number of digits.
What is a digit? On a 32-bit processor, it will probably be 32 bits. On a 64-bit processor, it will probably be 64-bits.
What this means is that operating on large integers, say, 1024 bits, will be twice as fast on the 64-bit processor for addition/subtraction and 4 times as fast for multiplication/division.
Most large integer packages use Karatsuba multiplication instead of the classical algorithm. Karatsuba is O(N^1.58). On a 64-bit processor, that is 3 times faster than on a 32-bit processor.
Looking at it from the other direction, if on a 32-bit processor, using a given set of algorithms which are working in base B, you can do public key cryptography using N bits, then just by using the same algorithm, working in base B^2 on a 64-bit processor running at the same basic speed, you can in the same time do public key cryptography using 2N bits.
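
The ratios quoted above can be reproduced with a few lines of arithmetic. This sketch only counts word operations; it assumes a 64-bit word operation costs the same as a 32-bit one and that a full-width multiply is available, and the function names are made up for illustration.

```python
import math


def digits(bits, word_bits):
    """Number of machine-word digits needed to hold a `bits`-bit integer."""
    return -(-bits // word_bits)  # ceiling division


def speedup(bits, exponent):
    """Ratio of word operations, 32-bit digits vs 64-bit digits, for an O(N^exponent) algorithm."""
    n32 = digits(bits, 32)
    n64 = digits(bits, 64)
    return (n32 / n64) ** exponent


if __name__ == "__main__":
    bits = 1024
    print("digits:", digits(bits, 32), "vs", digits(bits, 64))
    print("add/sub   O(N):      ", speedup(bits, 1))
    print("classical O(N^2):    ", speedup(bits, 2))
    print("Karatsuba O(N^1.58): ", round(speedup(bits, math.log2(3)), 2))
```

For 1024 bits that is 32 digits against 16, which gives the factors of 2, 4 and about 3 mentioned in the post.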
Re:Something is wrong. by Vellmont · 2004-01-23 20:58 · Score: 1

Pushing around a 64 bit pointer incurs more than a 10% performance penalty? Yick. I find that hard to believe. I'd still be more apt to believe there's some optimizations that aren't as optimized on the 64 bit side. It's either that, or gcc on the sparc makes really crappy 64 bit code.

--
AccountKiller
Re:Something is wrong. by thogard · 2004-01-23 23:33 · Score: 1

You have more crud to deal with in the register stack with 64 bits than 32 bits. That makes context swaps slower.
Re:Something is wrong. by Anonymous Coward · 2004-01-24 08:38 · Score: 0

Either OpenSSL doesn't have proper support for 64 bit data types, this fellow compiled something wrong, or some massive retard published benchmarks for the wrong platform in the wrong place.

Actually, blame Sun and the SPARCv9 designers. There is no 64x64->128 bit multiply op in SPARCv9 (unlike MIPS64, x86-64, Alpha, IA-64, etc, etc, etc). So the benefit of using 64 bit words vs 32 bit words on SPARC is pretty minimal. On most 64 bit CPUs you'll see at least a 4x improvement, and probably more depending on the characteristics of your cache/memory: bigger (fewer) words means better cache locality. But Sun fucked it all up. I was discussing this with a coworker a few days ago: he pointed out this deficiency would give Sun the opportunity to sell more SSL accelerator cards. I don't know if that was a factor, but it seems like something Sun would do.
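
To show what is lost without a full-width multiply, here is a sketch of building a 64x64->128 bit product out of four 32x32->64 bit multiplies, which is roughly what software has to do when the hardware only hands back a 64-bit result. Python integers are unbounded, so the masks below stand in for register widths; the function names are hypothetical.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def mul32x32(a, b):
    """One 32x32 -> 64 bit multiply, the widest product assumed available."""
    assert 0 <= a <= MASK32 and 0 <= b <= MASK32
    return a * b


def mul64x64_to_128(x, y):
    """Return (high64, low64) of x * y using four 32x32->64 partial products."""
    x_lo, x_hi = x & MASK32, x >> 32
    y_lo, y_hi = y & MASK32, y >> 32

    lo_lo = mul32x32(x_lo, y_lo)
    lo_hi = mul32x32(x_lo, y_hi)
    hi_lo = mul32x32(x_hi, y_lo)
    hi_hi = mul32x32(x_hi, y_hi)

    # Add up the middle 32-bit column; anything above bit 31 carries upward.
    mid = (lo_lo >> 32) + (lo_hi & MASK32) + (hi_lo & MASK32)
    low = ((mid & MASK32) << 32) | (lo_lo & MASK32)
    high = hi_hi + (lo_hi >> 32) + (hi_lo >> 32) + (mid >> 32)
    return high & MASK64, low


if __name__ == "__main__":
    hi, lo = mul64x64_to_128(MASK64, MASK64)
    print(hex(hi), hex(lo))
```

Four multiplies plus the carry handling replace what would be a single instruction on a CPU with a widening multiply, so the 64-bit digit buys little.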

I might test this... by after · 2004-01-23 16:35 · Score: 1

... once me and a friend get a Sun Ultra (dunno what one yet) in a few days. Hopefully, we can set up something that requires a lot of number crunching. Perhaps a 64-bit SETI@home, I dunno.

The cool thing for programmers is that, with the right macros and functions, one can use a single 64-bit integer as (get this!) 2 32-bit integers. This will come in handy for games and such where two sets of common numbers are accessed frequently.
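
A minimal sketch of the packing idea, assuming unsigned 32-bit values; the helper names are made up for illustration. The one catch worth showing is that a plain add on the packed word lets a carry leak from the low half into the high half, so each half has to be masked.

```python
MASK32 = 0xFFFFFFFF


def pack_pair(hi, lo):
    """Pack two unsigned 32-bit values into one 64-bit integer."""
    return ((hi & MASK32) << 32) | (lo & MASK32)


def unpack_pair(packed):
    """Split a packed 64-bit integer back into (hi, lo)."""
    return (packed >> 32) & MASK32, packed & MASK32


def add_pairs(a, b):
    """Add two packed pairs half by half, so a low-half carry cannot leak upward."""
    lo = ((a & MASK32) + (b & MASK32)) & MASK32
    hi = ((a >> 32) + (b >> 32)) & MASK32
    return (hi << 32) | lo


if __name__ == "__main__":
    a = pack_pair(1, 0xFFFFFFFF)
    b = pack_pair(2, 1)
    print(unpack_pair(add_pairs(a, b)))   # halves kept separate
    print(unpack_pair((a + b) & ((1 << 64) - 1)))  # naive add: carry leaks
```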

Re:I might test this... by Anonymous Coward · 2004-01-23 20:32 · Score: 0

Uh, yeah. It's called SIMD.
Re:I might test this... by Anonymous Coward · 2004-01-24 07:43 · Score: 0

Actually, the various SIMD extensions on x86 (SSE, SSE2, 3D Now!, MMX, etc.) can already do this on the existing 32-bit architecture, up to 128-bit (4 32-bit words) and even 256-bit combinations per instruction (they can also do it with floating point). That's sort of the neat thing about hardware... you can throw all sorts of stuff into it, and don't need to be constrained by the limitations of the "platform".

Anyone ever used WinXP-64bit edition? by CatGrep · 2004-01-23 16:36 · Score: 4, Interesting

We've got an Itanic box at work that has WinXP 64bit edition on it so we can build & test some 64bit Windows binaries.

It's the slowest box in the place! Open a terminal (oops, command shell, or whatever they call it on Windoze) and do a 'dir' - it scrolls so slowly that it feels like I'm way back in the old days when I was running a DOS emulator on my Atari ST box.

Pretty much everything is _much_ slower on that box. It's amazingly bad and I've tried to think of reasons for this: Was XP 64bit built with debugging options turned on when they compiled it? But even if that were the case it wouldn't account for all of it - I'd only expect that to slow things down maybe up to 20%, not by almost an order of magnitude.

Re:Anyone ever used WinXP-64bit edition? by Anonymous Coward · 2004-01-23 16:48 · Score: 0, Flamebait

in my opinion, that describes windows XP.

i have never seen a machine running XP with decent performance.
Re:Anyone ever used WinXP-64bit edition? by KewlPC · 2004-01-23 16:55 · Score: 1

Was WinXP64 specifically optimized for the Itanium? Probably not. The Itanium places a lot of burden on compilers, and Microsoft's compilers probably aren't generating optimal Itanium code yet.

And, of course, if Win64 is running in the Itanium's x86 compatibility mode, then of course it's going to be slower. Intel has advertised from the very beginning that the Itanium's x86 compatibility mode was just to help ease the transition to the Itanium, and has always said that it would run slower than the Itanium's native 64-bit mode.

I'm not a huge fan of the Itanium myself, but please try to get things straight.
Re:Anyone ever used WinXP-64bit edition? by Anonymous Coward · 2004-01-23 17:17 · Score: 0

Maybe because it's an alpha or beta, and they are fixing the bugs before they get to the optimisation?
Re:Anyone ever used WinXP-64bit edition? by JanusFury · 2004-01-23 17:37 · Score: 4, Insightful

The video drivers are probably not optimized for 64-bit at all. In fact, I wouldn't be surprised if the box doesn't have native drivers at all, and is using MS's standard SVGA/VESA drivers. Those drivers are slow and any PC using them is going to feel horribly sluggish, even if it has a 3GHz P4.

--
using namespace slashdot;
troll::post();
Re:Anyone ever used WinXP-64bit edition? by CatGrep · 2004-01-23 17:48 · Score: 1

Was WinXP64 specifically optimized for the Itanium? Probably not. The Itanium places a lot of burden on compilers, and Microsoft's compilers probably aren't generating optimal Itanium code yet.

True. But the basic Itanium architecture has been out for a few years now. If it's proving to be that difficult to write compilers that generate efficient code for it, then maybe it's not all the fault of the compiler writers.

And, of course, if Win64 is running in the Itanium's x86 compatibility mode, then of course it's going to be slower.

I'm no expert on these issues, but if it was running on Itanium using the x86 compatibility mode, wouldn't that then imply that the same binaries could run on a Pentium? AFAIK XP-64 bit edition only runs on Itaniums right now (there are plans to support AMD 64-bit processors soon, though) which would imply that it's not using the compatibility mode, but it's running native Itanium code. But as someone else mentioned, perhaps the video and other drivers are still x86, 32bit code which would significantly slow down IO.
Re:Anyone ever used WinXP-64bit edition? by Anonymous Coward · 2004-01-23 17:49 · Score: 0

I concur.
Re:Anyone ever used WinXP-64bit edition? by Anonymous Coward · 2004-01-23 20:34 · Score: 0

It is indeed the video drivers.

For proof, next time you're using that "slow" itanium running XP64, shrink the command console so it doesn't have to exercise the video driver as much and you'll see a huge performance boost.
Re:Anyone ever used WinXP-64bit edition? by turgid · 2004-01-23 23:10 · Score: 1

We've got an Itanic box at work that has WinXP 64bit edition on it so we can build & test some 64bit Windows binaries.
You have my deepest and most heart-felt condolences.

--
Stick Men
Re:Anyone ever used WinXP-64bit edition? by Tim+C · 2004-01-24 00:15 · Score: 1

You've worked with some crap PCs in that case. These days, 2.4GHz and 256meg of RAM is pretty-much entry level, but XP will fly on that sort of spec.

(Yeah, I know, that's a lot of power to be called "entry level", but that's the way PC tech goes; just try going to a high-street shop and buying a lower-specced desktop. You won't find very many of them.)

--
It's official. Most of you are morons.
Re:Anyone ever used WinXP-64bit edition? by ball-lightning · 2004-01-24 03:13 · Score: 1

You've worked with some crap PCs in that case. These days, 2.4GHz and 256meg of RAM is pretty-much entry level, but XP will fly on that sort of spec.

WinXP runs acceptably on my 1.2ghz T-bird (I don't spend too much time waiting) and with all the eye candy turned off, runs fine on my P2 380mhz Thinkpad.

That being said, I really wish I had an "entry level" PC =-P
Re:Anyone ever used WinXP-64bit edition? by Tim+C · 2004-01-24 03:26 · Score: 1

Well, like I said, "entry level" does sound increasingly like a misnomer. What's entry level now would have been far beyond the top of the range just 5 years ago.

I define entry level, though, roughly as the cheapest PC you can buy off the shelf without too much hunting around. Right now, it seems to me that is around the spec I quoted, plus or minus a bit.

For what it's worth, I've had a P4 2.4B with 512 meg of RAM for a year now, and I generally buy at the sweet spot of the price/performance ratio. So yes, I consider my PC to be more or less entry level :-)

--
It's official. Most of you are morons.

Re:*Why* do I have that feeling... by Drantin · 2004-01-23 16:36 · Score: 2

please, read the rest of the article, he just uses that as an example to show that the arguments he was passing to the compiler really were having an effect on the output, although I don't see why he had to do that considering what he does afterwards, that part was not the benchmark...

--
Actio personalis moritur cum persona. (Dead men don't sue)

Retarded article. by Anonymous Coward · 2004-01-23 16:38 · Score: 1, Insightful

Short answer: No.

Medium answer: If you're not a programmer, yes. Expect about the same speed, but maybe slightly less.

Long answer: Direct comparisons like this are in no way valid because the code is identical. It's the same algorithm running at the same clockspeed. Your compiler can't program. Think about this: There's only so much space taken up by a logical operation. The question:

"is this bit set to one? if yes, do this.. if no, do that"

..does not get any faster just because of the size of the register the single bit is contained in. It's still bound by the clockspeed. Programmers can rewrite algorithms to do certain things in parallel, but it's probably not worth it unless it's a big memory operation, multimedia app, game or graphics package. For those it will be much better.

Which is why Intel is more concerned with clockspeed than number of bits.

Re:You forgot by larry+bagina · 2004-01-23 16:40 · Score: 0, Troll

score 0 insightful? this deserves a +5 at least!

--
Do you even lift?

These aren't the 'roids you're looking for.

what a bunch of bullcrap by Anonymous Coward · 2004-01-23 16:41 · Score: 1, Funny

No one had tested it before to my knowledge, so predicting the outcome was impossible.

yes, right. we predict only on things we've seen someone else do in the past.

you've got the right idea, mate...

Hello World As A Benchmark! by PissingInTheWind · 2004-01-23 16:43 · Score: 0, Offtopic

That has to be the most stupidest compiler test I've ever read.

And it's not the number of bits that is important: it's the size of them.

--

A message from the system administrator: 'I've upped my priority. Now up yours.'

Re:Hello World As A Benchmark! by xeon4life · 2004-01-23 20:26 · Score: 1

The Hello World program was not a benchmark. RTFA.

--
Real programmers can write assembly code in any language. -- Larry Wall
Re:Hello World As A Benchmark! by Anonymous Coward · 2004-01-23 21:04 · Score: 0

You're new here, aren't ya?

More info in scoop, please by RyatNrrd · 2004-01-23 16:43 · Score: 1

The info on the Slashdot page should read more like an abstract or executive summary of the article. What we have here reads much more like an advertisement for an article.

Yeah, I could and should RTFA, but I object to posts on the front page of Slashdot being "teasers" for other people's news sites. The info, please.

Re:More info in scoop, please by Anonymous Coward · 2004-01-24 04:51 · Score: 0

Mod the parent up, you bastards! :-P

It's not Ohmic! by Anonymous Coward · 2004-01-23 16:44 · Score: 0

When you're referring to Metal Oxide Semiconductor Field Effect Transistors, most CMOS circuits use the Ohmic region when at steady state. The saturation region is when it draws current, doh!

Forward thinking by Wellmont · 2004-01-23 16:45 · Score: 5, Interesting

Well considering that manufacturers have been working like crazy to produce both 64 bit hardware and software applications, one could see that there is still some stuff to be done in the field.
What most of the posts are considering and the test itself is "concluding" is that it has to be slower overall, even in the end when 64 bit computing finally reaches its true breadth. However, when the bottlenecks of the pipeline (in this case the cache) and the remaining problems are removed, you can actually move that 64 bit block in the same time it takes to move a 32 bit block.
Producing two 32bit pipes takes up more space than creating a 64bit pipe in the end, no matter which way you look at it and no matter what kind of applications or processes it's used for.
However the big thing that could change this theory is Hyper Compressed Carbon chips, that should replace silicon chips within a decade (that's a fairly conservative estimate).

Of Course They're Going to Be Slower by GoldMace · 2004-01-23 16:45 · Score: 0, Troll

For the most part, there's little need for the extra bits, so you are just wasting computer time processing unnecessary bits.

Maybe you should all concentrate on making things more efficient, rather than relying on faster processors to make your crappy bloatware look fast.

I don't care if you are from the GPL camp or even Microsoft, everything out there from both camps is bloatware!

Computers in 2004 should actually be faster than computers from 1995. From all I've seen, because of the constant bloatware, this is not even the case, and may actually be the opposite.

Re:Of Course They're Going to Be Slower by DeathPenguin · 2004-01-23 16:56 · Score: 2, Interesting

If it makes you feel better, programs from 1995 tend to run a lot faster on modern hardware. Gzip a kernel on a 66MHz Pentium and then on 2GHz Opteron and you'll see what I mean.
Re:Of Course They're Going to Be Slower by GoldMace · 2004-01-23 17:39 · Score: 1

Not even all of them do though. Some of the OS "improvements" make software that once ran fast run slow, for instance, many DOS games, on XP on today's fast systems compared to on '95 on 1995 era systems, that is if they even run at all. I just want stuff to go fast, I'd give up a lot of "features", to do that, though no one, including Linus, seems to be helping me on that goal. I just want fast, not features, and sure as hell not "security."
Re:Of Course They're Going to Be Slower by caffeineHacker · 2004-01-23 18:24 · Score: 1

XP has to emulate about half the stuff since they moved over to the NT lineage after ME. So yeah an emulator will usually run slower than the real thing. As far as GPL having too many features try Gentoo from stage 1, you can customize what you do and do not want every step of the way. Linus works on the kernel which just handles hardware, memory management, etc. So the kernel hackers aren't responsible for monsters like xfree86.

A Makefile? by PissingInTheWind · 2004-01-23 16:48 · Score: 2, Interesting

From the article:
[...] you'll likely end up in a position where you need to know your way around a Makefile.

Well duh. What a surprise: compiling for a different platform might require Makefile tweaking.

Am I the only one to think that was a dummy article wasting a spot for much more interesting articles about 64 bit computing?

--

A message from the system administrator: 'I've upped my priority. Now up yours.'

Re:A Makefile? by LoadWB · 2004-01-23 17:13 · Score: 3, Informative

I accept this article as dumbed down a bit for the lower end, non-guru user who is wooed by the 64-bit "revolution" but not technically savvy enough to understand the "32-bit faster than 64-bit" comments that continue to surface in many forums. If bean counters and cheap tech workers can be made to understand that there truly ARE benefits in 64-bit technology, then progress will not be held in place by beating the 32-bit horse to death -- even if it does run at hellaspeeds.

How many times have we slapped around these types of people with our new technology trout only to hear "Yeah, but $OLD_TECHNOLOGY is STILL being developed, and it's cheap. Why should we bother with $NEW_TECHNOLOGY." Yeah yeah, I know that technically 64-bit isn't NEW, but to these guys...

This is unfair comparison by superpulpsicle · 2004-01-23 16:49 · Score: 3, Insightful

Why are we comparing mature 32-bit software with 64-bit software in its infancy?

Re:This is unfair comparison by LoadWB · 2004-01-23 17:21 · Score: 1

For the same reason we compared 32-bit and 16-bit. Why Amiga users argued with Windows users. :)

But really, if we don't have comparisons like these then it's harder to justify the migration, and it's also difficult for the general public to gauge progress in the development of 64-bit technology in both software and hardware.
Re:This is unfair comparison by Anonymous Coward · 2004-01-23 18:41 · Score: 0

There were plenty of times my 6809 TRS-80 color computer "felt faster" than my IBM 5150.

Well, he's not cute.... by Anonymous Coward · 2004-01-23 16:49 · Score: 1, Interesting

but more to the point, why would you advertise your sexuality in a technical post? I mean, if a straight person (and believe me, there is no such thing as "bi", you're either straight, queer, or in denial about being queer) were to post "I love vagina" in every post, you'd rightfully make fun of him.

But we're supposed to care that you consider a man's ass a sexual input? Stop looking for so much attention and lets talk computers.

Re:Well, he's not cute.... by Anonymous Coward · 2004-01-23 17:34 · Score: 0

and believe me, there is no such thing as "bi", you're either straight, queer, or in denial about being queer

I assume you are gay then, but have had sex with women in the past. Otherwise, you lack the experience to tell anyone what their sexuality is. So, either fess up or shut up. Thank you.
Re:Well, he's not cute.... by Anonymous Coward · 2004-01-23 18:28 · Score: 0

why would you advertise your sexuality in a technical post?

Could it be because he's a troll? An obvious one?
Re:Well, he's not cute.... by Anonymous Coward · 2004-01-24 01:23 · Score: 0

He just said bi-sexual. He doesn't mention asses and vaginas as you have. Frankly, I find his "MBA Harvard" comment more irritating, though at least he's not bragging about his Mensa membership.

There are plenty of flaming straights here but I doubt you object to that. I do agree that the guy's not cute.

64bit vs 32bit and what that actually means by Anonymous Coward · 2004-01-23 16:53 · Score: 1, Interesting

My understanding of low level languages may not be comprehensive, however I am aware that for (let's use the simplest example I am familiar with) MIPS there are a number of registers for the storing of data that will be 'saved' and returned to the caller function, these registers are commonly known as $S0 - $S7, these registers have to be saved in the subroutine in order not to lose the volatile information stored therein.

for example: ...
sw $T0, 0($S0) ..

having more registers would allow you to bypass this step of writing the data out to memory; you could use one of the new registers that are not volatile and keep it there, thus removing the need for perhaps 5 instructions at a time on each return from a subroutine alone.

rather than the Store Word instruction (SW) you could just: ..
add $T1, $ZERO, $S0 ..

which would not be lost in the return to the calling function.

further to this, and i'm not sure that the intel x86 performs the same way, when you wish to load a large number into a register (i think in MIPS it's anything wider than the 16-bit immediate field; the registers themselves are 32 bits) you have to in fact perform TWO operations to load ONE number.

basically you load the upper half (the most significant bits) first

number = 0xFFFFFFF0 ..
LUI $T0, 0xFFFF      # load to the upper half of the
                     # register, because the immediate
                     # field only allows for 16 bits
ORI $T0, $T0, 0xFFF0 # OR in the second portion of the
                     # number. ...
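
A rough sketch of that two-step load, written in Python rather than MIPS, assuming 32-bit registers and a 16-bit immediate field. The helper names (lui, ori, load_constant) are made up for illustration and only mirror the instruction names. The lower half is ORed in, since a sign-extending add of 0xFFF0 would disturb the upper bits:

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit general-purpose registers


def lui(imm16):
    """Load Upper Immediate: put a 16-bit value in the top half of a register."""
    assert 0 <= imm16 <= 0xFFFF
    return (imm16 << 16) & MASK32


def ori(reg, imm16):
    """OR Immediate: the 16-bit immediate is zero-extended, then ORed in."""
    assert 0 <= imm16 <= 0xFFFF
    return (reg | imm16) & MASK32


def load_constant(value):
    """Build a 32-bit constant with the same two steps as the assembly."""
    t0 = lui(value >> 16)          # step 1: upper 16 bits
    t0 = ori(t0, value & 0xFFFF)   # step 2: lower 16 bits
    return t0


print(hex(load_constant(0xFFFFFFF0)))
```

Any constant that fits in 16 bits needs only one of the two steps, which is why the cost only shows up for "large" numbers.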

on the basis that x86 shares some of these things, then 64 bit must be faster GIVEN even ground with compilers and so forth, these are assumed (EVEN THO THAT IS NOT THE CASE) because otherwise its all pissing in the wind.

if this has errors, forgive me, this is not my area of specialty by a long stretch.

-Archfile

Re: OSNews by chunkwhite86 · 2004-01-23 16:55 · Score: 3, Insightful

Your "analysis" may be valid, but it's really not applicable. The title of the story is, "Are 64-bit Binaries Really Slower than 32-bit Binaries?" The author takes a 64-bit machine, compiles a few programs, and tests the resulting binaries to see which is faster.

How can you be certain that this isn't simply comparing the efficiency of the compilers - and not the resulting binaries???

--
I'd rather be a conservative nutjob than a liberal with no nuts and no job.

Re: OSNews by dant · 2004-01-23 16:55 · Score: 2, Interesting

But what's the fsking point?

News flash: 64-bit apps are, usually, slightly slower than 32-bit ones. Duh. Any developer who's been around 64-bit environments for more than a few weeks knows this. It's not like there's some subtle magic going on here; bigger pointers means more data to schlep around.
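
To put a number on "bigger pointers", here is a small sketch that lays out a hypothetical linked-list node (one 4-byte int and two pointers) under the usual natural-alignment rule. The node layout and the helper names are assumptions for illustration; real compilers and ABIs can differ:

```python
def struct_size(field_sizes):
    """Size of a C-style struct whose fields are naturally aligned.

    Each field is aligned to its own size, and the struct is padded to a
    multiple of its largest field.
    """
    offset = 0
    for size in field_sizes:
        offset = (offset + size - 1) // size * size  # pad up to alignment
        offset += size
    largest = max(field_sizes)
    return (offset + largest - 1) // largest * largest


def list_node(pointer_size):
    # Hypothetical node: struct { int value; node *next; node *prev; }
    return struct_size([4, pointer_size, pointer_size])


print(list_node(4))  # 4-byte pointers
print(list_node(8))  # 8-byte pointers
```

With these assumptions the node doubles from 12 to 24 bytes, partly from the pointers themselves and partly from the padding after the int, so half as many nodes fit in the same amount of cache.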

I think your parent's complaint is that this is sort of like a cursory analysis indicating that triangular wheels aren't quite so good as round ones. If you really needed to be told this, you aren't in the audience that the article sounds like it's trying to address.

Certainly, many applications need 64 bits to operate. That doesn't mean it's the best tool for all jobs. The tone of the article sounds like it's exploring some big question that nobody's thought about before, and that's just silly.

Re: OSNews by DNS-and-BIND · 2004-01-23 16:57 · Score: 4, Funny

Are you kidding? This guy is a genius. Not only did he actually figure out that the UltraSPARC-II processor is 64-bit, but he can actually use the file and time utilities! Most of the "linux admin" types I know who buy old Sparcs for the novelty factor end up putting linux on them anyway..."This Solaris stuff is too hard".

--
Shutting down free speech with violence isn't fighting fascism. It IS fascism!

What 'system of belief' is he following? by swordgeek · 2004-01-23 16:58 · Score: 4, Insightful

64-bit binaries run slower than 32? That's certainly the dogma in the x86 world, where 64-bit is in its infancy. That was the belief about Solaris/Sparc and the HP/AIX equivalents FIVE YEARS AGO maybe.

Running benchmarks of 32 vs. 64 bit binaries in a 64 bit Sparc/Solaris environment has shown little or no difference for us, on many occasions. If the author had used Sun's compiler instead of the substantially less-than-optimal gcc, I expect that his 20% average difference would have disappeared.

--

"People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban

Re:What 'system of belief' is he following? by LoadWB · 2004-01-23 17:19 · Score: 1

But you should also remember that the prolific market is SMB (Joe Businessman,) which will continue to use freely/cheaply available software such as gcc. If these markets do not see the value in moving to 64-bit and beyond, then we could potentially be stuck with exorbitantly priced 64-bit boxes languishing with an uncertain future. Think about what happened with similar technology in the past. IDE vs SCSI. Parallel vs any other better data bus. (It absolutely irked me that a pp-Zip drive could bring a considerably FAST computer to a stand-still.) And the list continues.
Re:What 'system of belief' is he following? by mcrbids · 2004-01-23 21:01 · Score: 1

Running benchmarks of 32 vs. 64 bit binaries in a 64 bit Sparc/Solaris environment has shown little or no difference for us, on many occasions. If the author had used Sun's compiler instead of the substantially less-than-optimal gcc, I expect that his 20% average difference would have disappeared.

But, for all intents and purposes, GCC is *it* when it comes to compilers.

I maintain several dozen servers - all running RH Linux, all using GCC.

I would never consider using anything else. GCC for me is just as much a part of the 64-bit question as the CPU!

For me, if gcc-compiled code runs 20% slower on 64 than 32, a 64 bit processor is 20% slower. I'll wait for another revision of gcc before considering 64 bits...

--
I have no problem with your religion until you decide it's reason to deprive others of the truth.
Re:What 'system of belief' is he following? by swordgeek · 2004-01-24 11:00 · Score: 1

The problem is that you're living in a different world. In Linuxland, gcc most definitely is it. However, gcc is nowhere NEAR "it" when you're talking about Sun gear. Changing from gcc to Sun Forte would increase performance by an average of 15% off the top, on the Sparc/Solaris platform.

Basically, while admins use gcc all the time on Sparc/Solaris, any real projects purchase the Sun compiler and enjoy the extra performance that it brings.

--

"People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban

Re: OSNews by be-fan · 2004-01-23 16:59 · Score: 2, Informative

Because he used the same compiler, in 32-bit and 64-bit mode???

--
A deep unwavering belief is a sure sign you're missing something...

Re:*Why* do I have that feeling... by inode_buddha · 2004-01-23 17:02 · Score: 1

OK, I'll do that. The compiler does make a difference. I'm just thinking that we don't take the whole picture into account enough here on slashdot.

--
C|N>K

Re:And 16 bit is slower than 8 bit by rs79 · 2004-01-23 17:07 · Score: 1

In 1981 I worked at a place that built multiprocessor business computers with an interpreted language (I take the blame for "CADOL III") using 8-bit computers.

Word came down from management/marketing that we "have to go 16-bit", never mind that we demonstrated we had a faster system built out of 8 bit stuff, marketing had to be able to sell 16 bit systems to run our language that was based on an 8 bit stack model.

64 bit makes sense if you're rampaging through memory 64 bits at a time. But nearly 10K for a "hello world!" program? Oh, help. This was like 600 bytes on a PDP-11 in C, much less in assembly.

I've seen things like Quark go from a 2 meg distro to a 300MB one going from 16 bit to 32 bit.

A 64 bit MS world terrifies me.

--
Need Mercedes parts ?

Well.. MySQL4 loves 64-bit by destiney · 2004-01-23 17:07 · Score: 1, Informative

I benched MySQL4 on a dual Athlon-MP system and it ran about 32% faster in 64-bit mode. Try it yourself is all I can say.

It was a sweet upgrade as I had been using the server in 32-bit mode the first couple of months having it.

Sophomoric article by Tuxinatorium · 2004-01-23 17:08 · Score: 1

It all depends on the specific architecture you're dealing with. neither is inherently faster. 64-bits just requires more transistors to deal with.

--
Repeal the DMCA!

It would depend. by BoneFlower · 2004-01-23 17:09 · Score: 0, Redundant

If the app dealt with numbers that need 64 bits to natively represent, and dealt with them primarily(like some sort of numerics program) then a 64 bit binary will probably win.
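
As a sketch of why that is, here is what a single 64-bit addition turns into when only 32-bit-wide operations are available: two adds plus carry handling. This is plain Python standing in for machine arithmetic, and the function names are made up for illustration:

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def add64_on_32bit(a, b):
    """Add two unsigned 64-bit values using only 32-bit-wide steps."""
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32

    lo = a_lo + b_lo                     # first add: low halves
    carry = lo >> 32                     # 1 if the low add overflowed
    lo &= MASK32

    hi = (a_hi + b_hi + carry) & MASK32  # second add: high halves plus carry
    return (hi << 32) | lo


def add64_native(a, b):
    """What one 64-bit add instruction computes (wraps at 2**64)."""
    return (a + b) & MASK64


x, y = 0x00000001FFFFFFFF, 1
print(hex(add64_on_32bit(x, y)), hex(add64_native(x, y)))
```

Multiplication and division get worse than this, which is where a numerics-heavy program would feel the difference most.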

of course, they are by ajagci · 2004-01-23 17:12 · Score: 5, Informative

Both 32bit and 64bit binaries running on the same processor get the same data paths and the same amount of cache on many processors. But, for one thing, 64bit binaries use up more cache memory for both code and data. So, yes, if you run 32bit binaries on a 64bit processor with a 32bit mode, then the 32bit binaries will generally run faster. But the reason why they run well and all the data paths are wide is because the thing is a 64bit processor in the first place--that's really what "64bit" means.

64bit may help with speed only if software is written to take advantage of 64bit processing. But the main reason to use 64bit processing is for the larger address space and larger amount of memory you can address, not for speed. 4Gbytes of address space is simply too tight for many applications and software design started to suffer many years ago from those limitations. Among other things, on 32bit processors, memory mapped files have become almost useless for just the applications where they should be most useful: applications involving very large files.
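
The address-space arithmetic behind that, as a quick sketch. The 6 GiB file is a hypothetical example, and the check is only an upper bound, since a real process also has to fit code, heap, stacks and the kernel's share into the same space:

```python
GIB = 2**30


def address_space_bytes(pointer_bits):
    """Number of distinct byte addresses a pointer of this width can name."""
    return 2**pointer_bits


def can_map_whole_file(file_size_bytes, pointer_bits):
    """Upper bound only: the usable limit in practice is lower."""
    return file_size_bytes < address_space_bytes(pointer_bits)


print(address_space_bytes(32) // GIB)   # GiB addressable with 32-bit pointers
print(address_space_bytes(64) // GIB)   # GiB addressable with 64-bit pointers
print(can_map_whole_file(6 * GIB, 32))  # hypothetical 6 GiB file
print(can_map_whole_file(6 * GIB, 64))
```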

Re:of course, they are by j3110 · 2004-01-23 18:01 · Score: 3, Interesting

Ummm... I beg to differ on the reasons...

Most 64/32bit hybrid machines probably just split the arithmetic/logic units in half (just takes a single wire to connect them anyhow). Having an extra ALU around will surely push more 32bit numbers through the pipe. It's not going to be as fast as a 64bit optimized application would gain from having the combined operations should it need them though.

I'm beginning to wonder these days how much CPU speed even matters though. We have larger applications that can't fit in cache, page switching from RAM that isn't anywhere near the speeds of the CPU, and hard drives that are only 40MB/s on a good day/sector, with latency averaging around 5-6ms. CPU is the least of my worries. As long as the hard disk is being utilized properly, you'll probably not see significant differences between processor speeds. I'm a firm believer that people think that 500MHz is slow because the hard drive in the machine was too slow. Unless you are running photoshop, SETI, Raytracing, etc., you probably wouldn't notice if I replaced your 3GHz processor with a 1GHz.

--
Karma Clown
Re:of course, they are by ajagci · 2004-01-23 18:58 · Score: 3, Informative

Having an extra ALU around will surely push more 32bit numbers through the pipe

That's an additional reason. There are probably many other places that neither of us has thought of that have been scaled up to make a true 64bit processor and that benefit 32bit applications running on the same hardware in 32 bit mode.

I'm beginning to wonder these days how much CPU speed even matters though.

It matters a great deal for digital photos, graphics, speech, handwriting recognition, imaging, and a lot of other things. And, yes, regular people are using those more and more.

Unless you are running photoshop, SETI, Raytracing, etc., you probably wouldn't notice if I replaced your 3GHz processor with a 1GHz.

You probably would. Try resizing a browser window with "images to fit" selected (default in IE, I believe). Notice how that one megapixel image resizes in real time? CPU-bound functionality has snuck in in lots of places.
Re:of course, they are by Anonymous Coward · 2004-01-23 20:35 · Score: 0

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/memory/base/physical_address_extension.asp
Re:of course, they are by Jeff+DeMaagd · 2004-01-24 03:46 · Score: 1

Do 64 bit binaries really use more code space? Really?

64 bit mode is the width of the register, not the length of the instruction word. Alpha's register width is 64 bit, but its words are 32 bit, IIRC, like a great many other 64 bit architectures. While the machine code may take a little more space (IIRC, never more than 50%) than ia32 for the same code functionality, that is easily accounted for in the RISC-CISC differences between the two architectures.

Re: OSNews by Anonymous Coward · 2004-01-23 17:14 · Score: 0

Which proves nothing. Since they are different platforms, the code might not be as optimized in both of them.

There's always a trade-off by KalvinB · 2004-01-23 17:18 · Score: 5, Insightful

between precision and speed.

It's not surprising that 64-bit processors are rated much slower than 32-bit ones. The fastest 64-bit AMD is rated 2.0ghz while the fastest AMD 32-bit is 2.2ghz.

If you use a shovel you can move it very fast to dig a hole. If you use a backhoe you're going to move much slower but remove more dirt at a time.

Using modern technology to build a 386 chip would result in one of the highest clock speeds ever but it would be practically useless. Using 386 era technology to build a 64 bit chip would be possible but it'd be massive and horribly slow.

I'm still debating whether or not to go with 64-bit for my next system. I'd rather not spend $700 on a new system so I can have a better graphics card and then have to spend several hundred more shortly after to replace the CPU and MB again. But then again, 64-bit prices are still quite high and I'd probably be able to be productive on 32-bit for several more years before 32-bit goes away.

Ben

--
Work Safe Porn

Re:There's always a trade-off by ozzee · 2004-01-23 17:53 · Score: 1

It's not surprising that 64-bit processors are rated much slower than 32-bit ones. The fastest 64-bit AMD is rated 2.0ghz while the fastest AMD 32-bit is 2.2ghz.
The fastest Opterons now selling are 2.2Ghz.
64 bits does not actually change the CPU very much at all. In this case the AMD64 ISA does have a few more registers so that will speed it up a little and 64 bit memcpy's will push the memory subsystems a bit more so this is all good. The real issue is the extra memory requirements to load and store 64 bit pointers. This makes the memory footprint a little larger and hence causes some programs to run somewhat slower.
I would hang tight for a while on the purchase front and see what happens to the prices.
Re:There's always a trade-off by sirsnork · 2004-01-23 18:10 · Score: 2, Insightful

You have also fallen into the "Clock speed is the measure of speed" myth. AMD could have easily boosted the clock speed of the AMD64's simply by extending the pipeline, just as Intel did with the P4 and are rumoured to have done with the Prescott core. This gives you the ability to clock the CPU higher but it does less per clock cycle.

--

Normal people worry me!
Re:There's always a trade-off by Anonymous Coward · 2004-01-23 18:28 · Score: 2, Informative

"Using modern technology to build a 386 chip would result in one of the highest clock speeds ever but it would be practically useless."

This is completely wrong. Clock rate is determined by your slowest pipe stage.

A modern P4 is a 20+ stage pipeline because they want to squeeze the logic into tiny little sections, so that they don't have any "big" pipe stages. This lets them ramp up the clock rate.

A 386-era design isn't going to be nearly that heavily pipelined. Since it has more logic per pipe stage, it will have a very slow clock rate by today's standards, even if you upgraded it to a modern fab process.

Plus, a 386 executes x86 instructions instead of "micro-ops" (the RISC-style instructions that are executed at the core of a modern pentium). Those instructions "do more" and require more logic to begin with.
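
A toy model of the "slowest pipe stage" point, for anyone who wants to play with it. All the delays below are hypothetical numbers picked only to show the shape of the effect, and the even split is an idealisation no real design achieves:

```python
def max_clock_mhz(stage_delays_ns, latch_overhead_ns=0.0):
    """The clock period must cover the slowest stage plus latch overhead."""
    period_ns = max(stage_delays_ns) + latch_overhead_ns
    return 1000.0 / period_ns


def split_evenly(total_logic_ns, stages):
    """Idealised pipelining: the same logic cut into equal slices."""
    return [total_logic_ns / stages] * stages


# Hypothetical figures, not measurements of any real chip.
TOTAL_LOGIC_NS = 10.0
LATCH_NS = 0.1

shallow = max_clock_mhz(split_evenly(TOTAL_LOGIC_NS, 4), LATCH_NS)
deep = max_clock_mhz(split_evenly(TOTAL_LOGIC_NS, 20), LATCH_NS)
print(round(shallow), round(deep))
```

Same process, same total logic, and the deeper pipeline still clocks several times higher in this model, which is the whole argument in one line of arithmetic.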
Re:There's always a trade-off by chrome · 2004-01-23 19:06 · Score: 1

Well, I have my 2.0Ghz Athlon 64 3200+ overclocked to 2.2Ghz ... I am pretty sure I could take it to 2.4Ghz, the thing is running at about 34C right now ... and its stable.

I'm a bit frightened of overclocking it too much, I have no idea what all the voltage and other tweaks do and whether I should be tweaking them along with the CPU speed .. heh :)

Anyway. This machine is a screamer. Fastest thing I've ever seen...
Re:There's always a trade-off by aceh0 · 2004-01-23 20:11 · Score: 1

AMD has two 64-bit chips rated at 2.2 GHz. That's the 3400+ and the FX-51. The MHz speed of a chip and its 'rating' are inherently different heh.
Re:There's always a trade-off by Anonymous Coward · 2004-01-23 23:01 · Score: 0

there are plenty of forums to help you reach your overclocking zen mode. but if the other recent amd offerings are any indication, you should be able to overclock that chip more than the modest 10 percent you have going now.
Re:There's always a trade-off by p3d0 · 2004-01-24 02:17 · Score: 1

The fastest 64-bit AMD is rated 2.0ghz while the fastest AMD 32-bit is 2.2ghz.
First, that certainly doesn't mean the 64-bit chip is slower. AMD is not like Intel: they don't focus their design on stratospheric clock speeds, but on overall performance. The K8 core is faster cycle-per-cycle than the K7.
Second, there is a 2.2GHz Opteron, as these SPECjbb2000 results show.
Third, even if there weren't, Moore's law tells us that a 10% clock speed difference amounts to, what, 11 weeks? So the K8 would be caught up by April. So who cares?
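
The back-of-envelope behind that figure, assuming clock speed doubles every 18 months. That is a loose reading of Moore's law (which is really about transistor counts), and the function name is made up for illustration:

```python
import math


def weeks_to_gain(ratio, doubling_months=18.0):
    """Time for exponential growth to cover `ratio`, given a doubling period.

    Assumption: speed doubles every `doubling_months` months.
    """
    months = doubling_months * math.log2(ratio)
    return months * 52.0 / 12.0


# 2.0 GHz -> 2.2 GHz is a 10% gap.
print(round(weeks_to_gain(2.2 / 2.0), 1))
```

Under those assumptions the answer lands just under 11 weeks; a 24-month doubling period would stretch it to roughly 14.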
Using modern technology to build a 386 chip would result in one of the highest clock speeds ever but it would be practically useless.
This doesn't make any sense. If you use modern technology, it's not a 386, it's a Xeon. That's what a Xeon is.
Maybe I just don't understand what you're saying.

--
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
Re:There's always a trade-off by Jeff+DeMaagd · 2004-01-24 03:49 · Score: 1

IMO, the AMD Athlon 64 3000+ plus a basic decent A64 board aren't that cost prohibitive compared to a Barton. I don't think you'll have to spend $500 to upgrade to that.

Re: OSNews by be-fan · 2004-01-23 17:18 · Score: 4, Informative

GCC uses the same code generator for both Sparc32 and Sparc64.

--
A deep unwavering belief is a sure sign you're missing something...

Re: OSNews by DARKFORCE123 · 2004-01-23 17:20 · Score: 0, Redundant

Like the last page says (you *did* RTFA, right?), if you don't like his review, go write your own and get it published

I read the article and it's another short article that is basic and cursory making some kind of grandiose conclusion from some tests that are laughable. Typical of OSNews.

And no I'm not going to write an article for you to enjoy.

Nit-picking: LD_LIBRARY_PATH vs crle? by LoadWB · 2004-01-23 17:26 · Score: 4, Interesting

The article mentions tweaking the LD_LIBRARY_PATH...

I was told a long time ago by a number of people I considered to be Solaris gurus -- not to mention in a number of books, Sun docs, etc. -- that the LD_LIBRARY_PATH variable was not only heading towards total deprecation, but introduced a system-wide security issue.

In its stead, we were supposed to use the "crle" command to set our library paths.

On all of my boxes I use crle and not LD_LIBRARY_PATH and everything works as expected.

Any pro developers or Solaris technical folks that can comment on this?

Re:Nit-picking: LD_LIBRARY_PATH vs crle? by kindbud · 2004-01-24 02:39 · Score: 1

Yeah, it is not necessary to use LD_LIBRARY_PATH or crle at all in the case described in the article. The Solaris dynamic linker is capable of choosing from the sparcv7 or sparcv9 subdirectories of /usr/lib (and /usr/dt/lib, etc.), the correct 32- or 64-bit library. The libs located directly in /usr/lib are not the actual libs. They are stubs for compile-time compatibility. They aren't used at runtime at all.

If the guy had experimented just a bit, he'd have found that simply putting 64-bit libs in /usr/lib/sparcv9 was all that he needed to do.
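
If you want to double-check which flavour a library is before putting it in one directory or the other, the class byte in the ELF header tells you (it is part of what the file utility reports). A minimal sketch, assuming the file is an ELF object; the function names are made up and this only looks at the first five bytes:

```python
ELF_MAGIC = b"\x7fELF"
EI_CLASS = 4  # index of the class byte within the ELF identification bytes
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}  # ELFCLASS32, ELFCLASS64


def elf_class(header):
    """Classify an ELF object from its first bytes; None if it isn't ELF."""
    if len(header) <= EI_CLASS or header[:4] != ELF_MAGIC:
        return None
    return ELF_CLASSES.get(header[EI_CLASS])


def elf_class_of_file(path):
    """Read just enough of a file to classify it (path is caller-supplied)."""
    with open(path, "rb") as f:
        return elf_class(f.read(EI_CLASS + 1))


print(elf_class(b"\x7fELF\x02"))
```

Calling elf_class_of_file on a library you are about to install should say "64-bit" for anything headed to a sparcv9 directory.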

--
Edith Keeler Must Die

Re: OSNews by Endive4Ever · 2004-01-23 17:26 · Score: 4, Interesting

I put NetBSD on most of my Sparc hardware. Because then I can run and build from the same exact source tree of packages as I use on my Intel boxes. And run a kernel built from exactly the same source.

Which brings up a point: both NetBSD/Sparc and NetBSD/Sparc64 will run on an Ultra 1, which is a 64 bit machine. Why doesn't somebody install each NetBSD port on two separate Ultra 1 machines? Then the benchmark comparison can be between the normal apps that build on both systems, running in parallel on two identical systems. It's exactly the same codebase except for the 32 or 64 bittedness.

--
---

Re:*Why* do I have that feeling... by inode_buddha · 2004-01-23 17:28 · Score: 1

Ok, I *hope* you don't take everything so literally as " use hello.c". It must be a sad life to always be so narrow.

BTW, -fbranch-probabilities looks interesting (gcc 3.2)

--
C|N>K

Re: OSNews by Anonymous Coward · 2004-01-23 17:33 · Score: 0

Yeah, I know what you mean...I thought Solaris was too hard as well.

Re: OSNews by Guuge · 2004-01-23 17:34 · Score: 5, Insightful

News flash: 64-bit apps are, usually, slightly slower than 32-bit ones. Duh. Any developer who's been around 64-bit environments for more than a few weeks knows this. It's not like there's some subtle magic going on here; bigger pointers means more data to schlep around.

That is the sort of "obvious" conventional wisdom that the article is questioning. In fact, 64-bit architecture means a lot more than pointer size, and merely counting bits is no way to estimate performance.

64Bit will be needed when Solid State memory comes by Bruha · 2004-01-23 17:36 · Score: 4, Interesting

When we get solid state hard drives and if they're reliable and fast as regular ram then ram will be gone and the SSD will take over. So in essence your machine may just allocate itself a huge chunk of the drive as its own memory space.

Imagine a machine that can grab 16g for its memory usage and your video card having a huge chunk for itself also. Along with your terabits of information space if things pan out well enough.

Re: OSNews by Guuge · 2004-01-23 17:39 · Score: 1

Shouldn't a fair benchmark take advantage of 64-bit-only optimizations?

64bit HW vs 64bit OS vs 64 bit Apps by AtariDatacenter · 2004-01-23 17:41 · Score: 1

Somewhat related to what you said, the UltraSparc-II processor was 64 bits *and* the OS was still 32 bits (Solaris 2.5.1). It was only really starting in the world of Solaris 7 that people were given the option to compile 64bit code for a 64bit OS. And they ran into a very small performance hit on the application. (Well known, at least, among system administrators for Solaris boxes.)

Of course, this may be obvious to you, so I point this out for the benefits of others.

BTW... no significant speed loss has been seen on the 64bit Solaris versions vs the 32bit versions. (And they'll let you downselect a 32bit install if that's what you want, even on 64 bit hardware. So at the OS level, it doesn't seem to make much difference.)

Re:Need to host child porn? by Lord+Kano · 2004-01-23 17:46 · Score: 0, Flamebait

Sure send an email to "postmaster@fbi.gov" and ask to host your child porn in their server farm.

You'll be amazed at how quickly they provide you with support. In fact, if you're lucky they'll send some of their field sales reps straight to your house to work out a deal.

LK

--
"Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano

retarded. by ArmorFiend · 2004-01-23 17:47 · Score: 0, Flamebait

Nice try, but no, the article is indeed retarded.

They've at best proved a supposition about a single architecture/process/compiler family. They have not proved a general case. Did they test on amd64? Alpha? Mips? No? Then why are they making unwarranted generalizations? Ah, they're retarded.

Re:retarded. by fucksl4shd0t · 2004-01-23 18:09 · Score: 4, Informative

They've at best proved a supposition about a single architecture/process/compiler family. They have not proved a general case. Did they test on amd64? Alpha? Mips? No? Then why are they making unwarranted generalizations? Ah, they're retarded.
Actually, they didn't make generalizations. He very specifically stated that he only tested on a 64-bit Sparc, and an older one at that. He pointed out that while you can make some general conclusions, you can and should run tests on other architectures.
He also pointed out that he only tested a few applications, not a whole bunch of them. He was questioning conventional wisdom and wanted to know if there was any fact behind it, and he determined that there was. He did not determine the entire scope of the facts, and he did not claim to do so.
Sorry, I found it to be an interesting read, but you really have to take the first page seriously when he says "I only tested these things, so I can only conclude based on these tests, and it doesn't prove the general case." If you ignore that, then yes, you'll wind up with what you took away from the article.

--
Like what I said? You might like my music
Re:retarded. by JacobO · 2004-01-24 03:55 · Score: 4, Insightful

I just wonder why some are so offended by the article. I have to believe that some people feel that he has "disagreed" with them or something to have such violent reactions. It's just some benchmarks; as he implies, it's better than some people just supposing the answer to things they are wondering about.
Re:retarded. by pantherace · 2004-01-24 03:58 · Score: 1

In addition to the first reply's comments (completely correct), there are no 32-bit alphas. So why the heck do you list them in with the others? The Alpha instruction set is the only one I know of that was purely 64-bit. (There probably are a couple of them hidden in EE, CE, CS projects.)
x86-64 (amd64) is the only one of them (at least sparc, mips, power(pc)) where the number of registers jumped to twice the amount (8->16int, 8fp/mmx, 8->16sse). However, some such as the 21264 alpha (3rd gen core (still in use in the 21364s today)) had 80int, and 80fp (if I recall correctly). Also, the opteron/a64 has an onboard memory controller, for which they almost certainly got the inspiration from the alphas.
Basically, on systems that have 32-bit subsets that don't enable new features (all except alpha and x86-64) the 64-bit will likely be a bit slower overall, as the article shows. However, x86-64 will be faster in 64-bit mode almost across the board (at least when compilers take advantage of it) and of course alphas are 64-bit only, so they are infinitely faster in 64-bit mode than 32-bit mode :)
Re:retarded. by ArmorFiend · 2004-01-24 05:27 · Score: 1

Alphas have a 32 bit pointer that you can use if you're too stale to recompile your program for 64 bit. How do you think MS was able to port Windows NT to it? :)

Gee, our 32 bit OS doesn't seem to be selling very well on Alpha.

As to all the other little niggly reasons why you can't test cleanly on other architectures ... maybe that's also the reason it's a bad idea to make generalizations about 64/32 in the title of the article, while only testing one architecture!
Re:retarded. by Anonymous Coward · 2004-01-24 07:07 · Score: 0

Go and read some articles on OSNews. There's some genuine idiocy there. The last one I remember, he was bitching about the provision in the SunOS license that doesn't allow you to build nuclear plants or devices using it. I mean, WTF.

I'm sure that if anyone were planning on using a Sun system for that purpose, they'd be more than happy to sell them a "trusted" version of it (for a heck of a lot more, granted), without that provision.

It's clear he's an idiot, and flamebait.
Re:retarded. by ArmorFiend · 2004-01-24 09:01 · Score: 1

All that is true, in the body of the article. But the title does not have those qualifications, and invites the feeble minded reader to quickly conclude, wrongly, that 32>64.
Re:retarded. by pantherace · 2004-01-24 11:55 · Score: 1

True, and using a 64-bit register with the upper 32 bits being zero has the same effect as really being 32-bit. And very true about NT on alpha being really 32-bit. (It was *stable* though, a whole lot more than any other windows platform I have used. I and pretty much everyone else haven't used IA-64.)
How long before sizeof(register)?

Old news by t0ny · 2004-01-23 17:54 · Score: 1, Insightful

Ah, this has all been heard before when we switched from 16-bit to 32-bit programs.

The fact is as true as it was then: some applications are going to run faster just because 32-bit compilers are more 'mature'. Once the newer method becomes mainstream, you will see either the same speed, or a gain in speed.

Needless to say, the guy in the other post who stated an analogy with an abacus had it right: something small is obviously going to execute faster. We aren't switching to 64-bit processors so we can run

10 print "64-bit is k3wl"
20 goto 10

The more complex applications of the future generation, as well as the ability to move large amounts of data from memory to cpu, is what is driving the move.

--

Manipulate the moderator system! Mod someone as "overrated" today.

Re:Old news by Anonymous Coward · 2004-01-23 22:43 · Score: 0

Well, there is a difference between 16 -> 32 and 32 -> 64.

The vast majority of programs on your computer can still run fine with 32 bits, and will continue to do so. Even in 10 years there will still be a lot of programs that don't require > 4GB*.

However, virtually nothing on your machine will work with 16 bits today (64KB). So for architectures like the SPARC, there will always be 32-bit binaries around. It just makes sense.

(*) "Oooh, 4GB enough for anyone, he says?" - well look in /bin, /usr/bin, etc. The day 'ls' takes more than 4GB is the day I write my own version.
Re:Old news by Anonymous Coward · 2004-01-23 23:20 · Score: 0

(*) "Oooh, 4GB enough for anyone, he says?" - well look in /bin, /usr/bin, etc. The day 'ls' takes more than 4GB is the day I write my own version.

You don't have to wait for a theoretical 5GB /bin/ls in the year 2025. You could rewrite your company's database today and let it run well with, let's say, 1GB of RAM.

BTW, 64 bit has other advantages beside the bigger real address space. The bigger virtual address space makes some things easier under the hood of the OS, even if you just use 4GB of RAM. And the wide registers speed up applications, too.
Re:Old news by Anonymous Coward · 2004-01-23 23:35 · Score: 0

Actually I much prefer to see 64 bits used where they are needed rather than terrible hacks involving mmaping into segments of ram and other such crap.

No, a database is a fine thing to be using 64 bits.
Re:Old news by AndroidCat · 2004-01-24 01:35 · Score: 1

I remember when it was going from 8 bit to 16 bit. Since most applications were manipulating 8 bit characters, was 16 bit really going to provide a speed advantage, and was it worth converting a large installed software base? Weren't the extra bits just for bragging rights?
More registers and huge amounts of ram usually carried the argument back then too. I rarely use my 3 MHz 8085 these days. :^)

--
One line blog. I hear that they're called Twitters now.
Re:Old news by steeviant · 2004-01-24 20:25 · Score: 1

Uh, the limit of 16 bit addressing is actually 1Mb.
8 bit is 64Kb.

What else is new. This is about scaling by Anonymous Coward · 2004-01-23 17:54 · Score: 3, Insightful

Adding more/more complex features to a cpu rarely speeds it up by itself; however, it might allow the next generation of CPU to scale beyond the current generation.

Both in terms of direct CPU performance and for the software that runs on it.

This has happened a bunch of times during history. Remember the introduction of MMUs for instance? Definitely slows down the software running on the machine, but without an MMU we all know that it was virtually impossible to do stable multitasking.

1/2 GB of memory is basically the standard these days with XP.

A lot of people are buying home computers with 1 GB or more.

Dell in Japan (where I live) has a special offer these days on a Latitude D600 with 1GB of ram. That is, they expect to sell this thing in quantities.

I think a fair amount of PC users will hit the 4GB limit within a few years. Personally, I already swear about having just 1GB in my desktop at times when I have a handful of images from my slide scanner open in photoshop + the obvious browsers/mail programs and maybe an office program or 2 open.

Introducing 64bit does not make today's HW any faster than its 32bit counterparts, but it will make it possible to continue making machines better, faster and capable of handling increasingly complex tasks.

Re:What else is new. This is about scaling by Anonymous Coward · 2004-01-23 20:41 · Score: 0

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/memory/base/physical_address_extension.asp
Re:What else is new. This is about scaling by Anonymous Coward · 2004-01-24 00:17 · Score: 0

Remember the introduction of MMUs for instance? Definitely slows down the software running on the machine, but without an MMU we all know that it was virtually impossible to do stable multitasking.
The Amiga never had an MMU and seemed to do better than Windows 3.1 at multitasking, as I remember.

Now I'm confused... by fullofangst · 2004-01-23 17:56 · Score: 1

If AMD market their Athlon 64 3200 as being 10% faster than a generic 32bit Athlon 3200, where is the speed difference coming from? This article seems to imply the only advantage to having 64 bits is being able to sport a greater quantity of RAM.

Re:Now I'm confused... by BiggerIsBetter · 2004-01-23 18:08 · Score: 1

...where is the speed difference coming from?

Marketing.

--
Forget thrust, drag, lift and weight. Airplanes fly because of money.
Re:Now I'm confused... by NerveGas · 2004-01-23 21:01 · Score: 3, Informative

The performance increase comes from a combination of lower memory latency (built-in memory controller) and an increased number of registers. The small number of registers on x86 chips has always been one of the main gripes people have had about the architecture.

steve

--
Oh, you're not stuck, you're just unable to let go of the onion rings.

Slower? It depends. by BobaFett · 2004-01-23 17:56 · Score: 5, Informative

Depends mainly on what data the test is using. If it's floating-point heavy, and uses double, then it always was 64-bit. On 64-bit hardware it'll gain the full-width data path and will be able to load/store 64-bit floating-point numbers faster, all things being equal. If it uses ints (not longs), it is and will stay 32-bit, and there will be no difference unless the hardware is capable of loading two 32-bit numbers at once, effectively splitting the memory bus in two (HP-PA RISC can do it, his old Sun cannot, newest Suns can, I don't know if Opterons can). Finally, if the test uses data types which convert from 32 to 64 bits it will become slower, but only if it does enough math on these types. The latter is important, since every half-complicated program uses pointers, explicitly or implicitly, but not every program does enough pointer arithmetic compared to other operations to make a difference. However, if it does, then it'll copy pointers in and out of main memory all the time, and you can fit only half as many 64-bit pointers into the cache.
That's where the slowdown comes (plus some possible library issues, early 64-bit HP and Sun system libraries were very slow for some operations).
If your process resident memory size is the same in 64 and 32-bit mode, you should not see any slowdown. If you do, it's an issue with the library or the compiler (even though the compiler in this case is the same, the code generator is not, and there may be some low-level optimizations it does differently). If the resident size of the 64-bit application is larger, you are likely to see a slowdown, and the more memory-bound the program is the larger it'll be.
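
To put rough numbers on the "half as many pointers in the cache" point, here is a minimal sketch. The node layout (a 32-bit int plus a "next" pointer) and the 16 KB cache size are assumptions made up for illustration; they are not taken from the article or from any particular machine.

```python
import struct

# Hypothetical linked-list node: a 32-bit integer payload followed by a "next" pointer.
# "=" selects standard sizes with no implicit padding, so padding is spelled out.
NODE_ILP32 = "=iI"    # int32 + 32-bit pointer
NODE_LP64 = "=i4xQ"   # int32 + 4 bytes of alignment padding + 64-bit pointer

CACHE_BYTES = 16 * 1024  # assumed data-cache size, for illustration only


def nodes_per_cache(node_format, cache_bytes=CACHE_BYTES):
    """Return how many nodes of the given layout fit in the cache."""
    return cache_bytes // struct.calcsize(node_format)


if __name__ == "__main__":
    for name, fmt in (("32-bit", NODE_ILP32), ("64-bit", NODE_LP64)):
        print(name, struct.calcsize(fmt), "bytes/node,",
              nodes_per_cache(fmt), "nodes fit in cache")
```

A program that mostly chases pointers through such nodes sees its working set double in 64-bit mode, while a program that mostly crunches ints or doubles sees almost no change in size.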

Re:Slower? It depends. by pe1chl · 2004-01-23 22:51 · Score: 2, Informative

That is why I am a bit astonished that he finds a 20% slowdown, then also examines the increased size of the executables, finds it is about 20%, and considers that a minor issue.

I think the 20% increased size is the reason for the 20% worse performance, because memory access is often the bottleneck for real-life programs.

Re: OSNews by S.Lemmon · 2004-01-23 17:57 · Score: 2, Funny

Ha! Shows what you know! Atari-STs came with a built-in 3.5" drive - not a 5.25" so I say nya! to your feeble attempt at computer critic criticism.

HI, OMG LOL LETS COMPILE KERNEL LOL, GET LEARNERS by Anonymous Coward · 2004-01-23 17:57 · Score: 0

PERMIT TOMORROW!?!?!?

seriously guys, this is a joke. Take a computer architecture class, and then shut the fuck up.

Why does slashdot have such a bad rep? Cause of dumb shit like this making it to the front page. Good Job at playing scientist guys, really, let me be the first to congratulate you.

lar lar

Ah well by fullofangst · 2004-01-23 17:58 · Score: 2, Insightful

I figured I would post a comment about AMD and their 64 bit chip benchmarks. Then I realised I was already beaten to it by about eleventy billion other people. Guess I should at least do a FIND through the comments before posting in future!

Re:Ah well by tigga · 2004-01-23 20:20 · Score: 1

eleventy ???
Is it 110?
Re:Ah well by RazzleDazzle · 2004-01-23 21:14 · Score: 1

Perhaps you have not heard the SnL Celebrity Jeopardy episode with Keanu Reeves?

--
ZERO ZERO ONE ZERO ONE ZERO ONE ONE! Just brushing up for my next big invention: Ethernet over Voice (EoV)

Re: OSNews by Anonymous Coward · 2004-01-23 17:59 · Score: 0

And now we see evidence of why people with a clue leave /. They go away, wide-eyed and wondering, and discover how many useful sites there are on the 'net!

Goodbye, Ninwa, and good luck! You shall be missed!

*grin*

Re: OSNews by be-fan · 2004-01-23 18:00 · Score: 5, Informative

On SPARC, there are no 64-bit-only optimizations. The only reason to use 64-bit math is either if you need 64-bit integers, or use 64-bit pointers. None of the benchmarks can use either (the MySQL benchmark could, but the machine only had 256MB of RAM).

--
A deep unwavering belief is a sure sign you're missing something...

Yeah, and 16 bit apps are faster even more by Anonymous Coward · 2004-01-23 18:03 · Score: 0

Not to mention 8bit wonders such as Boulder Dash, Kokotoni Wilf ...

This would appear to miss the point... by Bored+Huge+Krill · 2004-01-23 18:04 · Score: 3, Insightful

the tests were all run on a 64-bit machine. The argument is not so much about whether 32-bit or 64-bit binaries run faster, but which is the faster architecture. I'm pretty sure we don't have any apples-to-apples test platforms for that one, though

Re:And 16 bit is slower than 8 bit by iggymanz · 2004-01-23 18:05 · Score: 1

8 whole bits??!!! geez, things sure have gone to pot since we got away from the perfectly serviceable 4-bit Intel 4004 with 12 bit addressing. Who the heck needs more than 4kb of addressable memory? Cruft mongers, that's who. And look at the transistor count bloat, 2,300 in the 4004 to how many freaking millions in these 32+ bit hogs.

Re:Well.. MySQL4 loves 64-bit by sirsnork · 2004-01-23 18:06 · Score: 2, Interesting

How exactly did you get 2 x 32bit processors running 64bit code?

--

Normal people worry me!

6502? by tonywestonuk · 2004-01-23 18:09 · Score: 3, Interesting

By what method is a processor judged to be 16,32 or 64 bits?...

The 6502 had 8bit data, but 16 bit address bus, and was considered an '8 bit'
68000 had 16 bits data, 32 bits address - this was a 16 bit

So, why can't we just increase the address bus size of processors to 64, while keeping the data bus size at 32 bits, and have some new 64 bit addressing modes? The processor can still be 32 bit data, so the size of programs can stay the same.... Store program code in the first 4 gigs of memory (zero page!), and the pointers within that code can remain 32 bits, but have the option of using bigger 64 bit pointers to point at data in the remaining address space. This should give the best of both worlds of 32 vs 64 bit.

Re:6502? by thogard · 2004-01-23 23:31 · Score: 1

The 6809 was 8 bit but could cope with 16 bits operations very well. The 8086 was 16 bits but was more comfortable dealing with 8 bits at a time.
Processor bit size is marketing and has been for a long time.

I've been using 64 bit machines since they started to be practical for data centers. I don't see a use for them in most applications. What I see is needed is a processor that can quickly cope with a 40 to 48 bit address space as well as 32 bit ints, double precision numbers, and very long word data sets, at least 1024 bits if not 32kbit words at a time.
Re:6502? by Anonymous Coward · 2004-01-24 06:50 · Score: 1, Informative

Ever since the Pentium Pro, 36 bit address buses have been supported. If you look in your linux kernel options you'll see support for PAE (Physical Address Extension) and more than 4 gig of memory. This allows for up to 64 gig of memory. But the addressing mechanism is a bit messy.

I also *THINK* (can't find anything to back this up) that the Opteron and Itanium are not capable of addressing a full 64 bit address space.

So basically we already increased the address bus size. It's a messy solution: it increases page lookup times and you can still only access any 4 gig at once (kind of reminds me of the EMS and XMS systems that allowed you to access addresses beyond 1 meg in real mode).

Clean 64 bit implementations will also give a major boost to both integer and floating point performance. See - http://www.digit-life.com/articles2/amd-hammer-family/index.html
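
For anyone who wants the arithmetic behind the 4 gig and 64 gig figures above, a quick sketch. The function name is just something made up for this example; the only inputs are the address widths already mentioned (32 and 36 bits).

```python
GIB = 1 << 30  # bytes in one gigabyte (binary)


def addressable_bytes(address_bits):
    """Bytes reachable with a flat address of the given width."""
    return 1 << address_bits


if __name__ == "__main__":
    flat_32 = addressable_bytes(32)    # what a single 32-bit pointer can reach
    phys_36 = addressable_bytes(36)    # physical memory with 36 address lines
    print(flat_32 // GIB, "gig visible through one 32-bit pointer")
    print(phys_36 // GIB, "gig of physical memory with 36-bit addressing")
    print(phys_36 // flat_32, "separate 4 gig windows to juggle")
```

The last number is the messy part: the extra memory is there, but a 32-bit process only ever sees one 4 gig window of it at a time.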

If its currently a win32 app... by davegust · 2004-01-23 18:10 · Score: 2, Informative

Address Windowing Extensions (AWE) really are a good solution for your problem.

If you're doing Win32, but really want 64-bit, then consider Win64. There are several OEMs providing it.

If your response is "can't afford it", then your .5 Terabyte database project is probably underfunded and likely to fail.

Re:If its currently a win32 app... by yecrom2 · 2004-01-24 04:13 · Score: 1

Can afford it. Don't want to. Some of our largest potential customers have told us that they won't touch a windows platform. We'd also get more bang for the buck from almost any other 64bit os.

Matt

Re:Well.. MySQL4 loves 64-bit by Tweaker_Phreaker · 2004-01-23 18:12 · Score: 1

How could you possibly run a 64 bit binary on a 32 bit cpu?

That reminds me... by Anonymous Coward · 2004-01-23 18:12 · Score: 0

Score:
Matt 1 - J.Lo 0

Hehehe

Actually that's why CISC isn't that bad by TheLink · 2004-01-23 18:12 · Score: 1

Whilst variable length instructions were frowned upon by all those RISC folk (they liked all instructions to be the same size), given that current memory buses are so much slower than the CPU, all the complexity of popular instructions being shorter isn't really much of a disadvantage; it's a bit like instruction compression to improve bandwidth.

Also smaller code fits in cache better.

--

Too many replies beneath your current threshold

Re: OSNews by Endive4Ever · 2004-01-23 18:13 · Score: 2, Insightful

Solaris isn't any harder. It's just closed source and there isn't anywhere near as much free software available for it. There certainly aren't as many 'guide for the clueless' websites as there are for Linux, needless to say. That can sometimes be a positive thing. To run free software packages, you can try to coerce the Zoularis thing and build software from the NetBSD pkgsrc tree on it, I guess. The interface between 'free software' and Solaris just has a lot more rough edges, in my experience, than running a Free OS on it from the start. I run Solaris on my SS10sx, because there's no free-software X Server for it that supports 24 bit color on its dual cgfourteen framebuffer, but other than the ability to 'boast' about running Solaris at home, there's not much other reason to run it. I guess that's a status thing, or something.

--
---

AMD64? by TheLink · 2004-01-23 18:17 · Score: 1

Port it to AMD64?

Get a 64 bit O/S and run your stuff in 32 bit so it gets most of its 4GB and the O/S still has its own space for caching etc.

--

Too many replies beneath your current threshold

Re:AMD64? by yecrom2 · 2004-01-24 04:02 · Score: 1

Actually, that's something that I'm working on. I'd like to get what we currently have 64-bit clean. I've got our Apps running on FreeBSD & Linux, both IA32 and I'm trying to get my hands on an AMD64 box or two.

Matt

Re:*Why* do I have that feeling... by fucksl4shd0t · 2004-01-23 18:32 · Score: 3

OK, I'll do that. The compiler does make a difference. I'm just thinking that we don't take the whole picture into account enough here on slashdot.

You're right about everything, except trying to imply, if that's what you were doing, that he used the 'wrong' compiler. In order to test execution speed of 32-bit vs 64-bit binaries, you need to use the same compiler to build the binaries.

See, it gets complicated when you use different compilers. Yes, GCC is likely to build better-optimized binaries for 32-bit. Yes, GCC has a reputation for not optimizing binaries very well in the first place. But if he didn't use the same compiler for both binaries, the results would have been seriously skewed in answering the question. The results would have called into question why he used different compilers, whether or not the different compilers were equal, and so forth.

To answer the question, he needed a compiler that could build both types of binaries to the same level of optimization, no matter how shitty. He wasn't trying to build the fastest binaries on earth, he was trying to build binaries that could be compared to one another in execution speed, using the same source code, and a compiler that would produce the same shitty executable.

That's all. :)

--
Like what I said? You might like my music

Re: OSNews by mikeabbott420 · 2004-01-23 18:39 · Score: 1

Is there a rule of thumb type process for 64 bit overhead? i.e. if I build my application (C source, 2 GB memory) in 64 bit mode what can I expect to pay in CPU time? My guess is that the loss will be real but trivial; my applications, like most, are bound by disk etc. I expect the 64 bit OS will be the determining factor, i.e. does it cache disk better etc.

--
This program was made possible by a grant from the Ultra-Humanite, and viewers like you.

Re: OSNews by Anonymous Coward · 2004-01-23 18:51 · Score: 0

While the article is a curious comparison of a small set of programs on SPARC{32,64}, I think it is far too limited with only one architecture. A better stab at this issue would include AMD64 vs x86, as well as MIPS{32,64}. The AMD64 ABI under Linux shines especially well in this kind of comparison, as the improvements do a good job of offsetting the extra cost associated with 64 bit pointers and longs.

Cleanup on aisle 3! by CaptainCarrot · 2004-01-23 18:55 · Score: 3, Funny

Yeah, like that one time when I tripped over a coax while walking behind a row of Apollo DN660s and yanked it clean out of the connector. Yeesh! Tokens everywhere! I had to get the mop out, and here I was not even in the janitors' union. That by itself could have gotten me fired. As it was I didn't get caught for that, but the network went down and everyone knew it was my fault because of all the squashed token guts on the bottom of my shoes.

We were finding the damn things in the ventilators for weeks afterward.

--
And the brethren went away edified.

Re: OSNews by Anonymous Coward · 2004-01-23 18:58 · Score: 0

You've been trolled. Twice in a row.

Re: OSNews by Anonymous Coward · 2004-01-23 19:07 · Score: 0

Three times, now. Except this time I'll post it as A.C. because it's not worthy of anybody reading it. Like your comment, this one will stay at 0 until the thread is archived, when it'll disappear. The first two times I think I said at least a few things of interest to readers here.

Re: OSNews & The 64 Million Dollar Question by Anonymous Coward · 2004-01-23 19:09 · Score: 0

The 64 million dollar question is:

Why does slashdot link to OSNews articles? Is it some perverted NeoCalvinistic response to feelings of inadequacy and post-adolescent guilt?

Re:And 16 bit is slower than 8 bit by micromoog · 2004-01-23 19:10 · Score: 1

Who the heck needs more than 4kb of addressable memory? Cruft mongers, that's who.

Cruft mongers . . . that's a beautiful term.

OSNews = UnNews? by MrNybbles · 2004-01-23 19:16 · Score: 3, Flamebait

Dant makes a good point about taking a little longer to read 64 bits than 32.

Still, the word size of the processor is not a major factor in how fast a CPU is. Finding faster ways to process instructions, caches, and how fast you run the CPUs make more of a difference. I am probably leaving out a lot of other major factors. Oh well.

The article is a bit interesting although it seems very amateurish. Just my personal opinion.

In fact the same logic means that with all else being equal an 8 bit processor is slightly faster than a 16 bit processor and a 16 bit processor is slightly faster than a 32 bit processor. But of course all else is never equal so things are usually the other way around.

Has anyone heard of a set (family?) of processors that were exactly the same EXCEPT for the processor's word size?

--
Losing faith in humanity one person at a time.

Re:OSNews = UnNews? by fitten · 2004-01-23 20:28 · Score: 5, Informative

Don't know how they could be exactly the same *except* for the word size. In order to process the two different word sizes, there will have to be differences in circuitry (ALU is wider, so are lots of things like the buffers between pipeline stages and such).

One of the issues that people forget is that a 64-bit processor may be able to retire a set number of 64-bit, say, integer additions per clock cycle (NOTE: retiring an operation per clock cycle does NOT mean that the operation takes one clock cycle to perform). Well, the odds are that it will also retire the same number of 32-bit integer additions per clock cycle. It may even take 5 clock cycles to do either size of addition. So, what do you have that is different? Well, on the SPARC, most simple operations are going to be similar in execution time. Regardless of the number of register windows that the particular architecture supports (which may come into play in some codes), you still basically have 32 registers for use in your computational kernel. The only real difference between many 32-bit and 64-bit versions of the code will be the amount of data that has to be moved around.

Where the 64-bit will help is when the 32-bit code has to synthesize 64-bit operations or has to do things like work on bit streams (not word/byte streams exactly) and can work on 64 bits at a time rather than doing the same thing 32 bits at a time, twice as often (128 bytes can be traversed in 32 32-bit operations or 16 64-bit operations - half the number of reads/operations).

All of this is pretty well understood by those who have dealt with these types of systems before. However, the relative newcomer Opteron has an additional twist. In 64-bit mode, there are twice as many registers that can be used compared to 32-bit mode. This may (read: will) cause some codes to run faster simply because more data can be stored in registers rather than memory; even L1 cache is a bit slower than a register.
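
The 128-byte example above can be checked with a small sketch. The XOR fold is just a stand-in for "some work done on a bit stream", and the helper name is made up for this post; it counts word reads, it does not measure speed.

```python
import struct


def xor_words(buf, word_bytes):
    """XOR-fold buf one word at a time; return (result, number_of_word_reads)."""
    fmt = {4: "<I", 8: "<Q"}[word_bytes]  # little-endian 32-bit or 64-bit words
    acc = 0
    reads = 0
    for (word,) in struct.iter_unpack(fmt, buf):
        acc ^= word
        reads += 1
    return acc, reads


if __name__ == "__main__":
    data = bytes(range(128))              # 128 bytes of sample data
    acc32, reads32 = xor_words(data, 4)
    acc64, reads64 = xor_words(data, 8)
    # Folding the 64-bit result down to 32 bits gives the same answer.
    folded = (acc64 & 0xFFFFFFFF) ^ (acc64 >> 32)
    print(reads32, "reads at 32 bits,", reads64, "reads at 64 bits")
    print("same result:", folded == acc32)
```

Same answer, half the loop iterations, which is exactly the case where wider registers pay off.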
Re:OSNews = UnNews? by llzackll · 2004-01-24 00:39 · Score: 2, Interesting

It shouldn't take longer to read 64 bits than 32 on a 64 bit architecture. Theoretically, a 32 bit machine will read 32 bits in a number of clock cycles, and a 64 bit machine should read 64 bits in the same amount of clock cycles. This doesn't necessarily mean faster execution times on 64 bit though. A lot of it depends on the compiler, and the OS.

Also, if you just rebuild an application that was designed around 32 bit in 64 bit mode, you probably aren't going to notice much of an improvement (if any at all).

I noticed the article used GCC, which probably hasn't caught on with 64 bit yet.. I'm pretty sure SUN has their own compiler, which probably would produce better 64 bit code than GCC on a SUN box.
Re:OSNews = UnNews? by sjames · 2004-01-24 05:51 · Score: 1

Theoretically, a 32 bit machine will read 32 bits in a number of clock cycles, and a 64 bit machine should read 64 bits in the same amount of clock cycles.

Data path width to main memory and native pointer/integer size have been decoupled for a long time now, as far back as the 8086. Both the '86 and the '88 were 16 bit native machines. The difference was that the '88 had an 8 bit data path and the '86 had 16. The '88 simply performed 2 successive fetch operations (in microcode) for a single register load instruction. These days, most fetches are done in L1 cache line widths regardless of the size of the load instruction.
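
As a back-of-the-envelope sketch of the '86 vs '88 difference: the number of bus transfers for one register load is just the operand width divided by the data path width, rounded up. This assumes an aligned operand and ignores caches and prefetch; the function name is made up for the example.

```python
def bus_transfers(operand_bits, bus_bits):
    """Minimum bus transfers needed to move one aligned operand."""
    return -(-operand_bits // bus_bits)  # ceiling division


if __name__ == "__main__":
    print("8086, 16-bit load over 16-bit path:", bus_transfers(16, 16))
    print("8088, 16-bit load over  8-bit path:", bus_transfers(16, 8))
```

Same instruction set and same register width in both cases; only the trip count to memory changes.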
Re:OSNews = UnNews? by Montreal+Geek · 2004-01-28 16:54 · Score: 1

Only such pair I know of are the 68008 vs the 68000; both are functionally identical (except for some very very minor differences) but the '08 has an 8-bit data bus (vs 16 for the '00).
The '08 was, unsurprisingly, slower whenever memory access was needed (its main use was its very simple E/Q clocking [compatible with 6809] and simple interfacing).
But that was before the days of superscalar architecture and on-die caches, both of which change the bottleneck point.
-- MG

Let's not do anything like that! by iamacat · 2004-01-23 19:45 · Score: 2, Insightful

I guess you didn't have the "pleasure" of using near, far and huge pointers in DOS compilers. In your model, every library function would have to have two versions - one that takes 32 bit pointers and one that takes 64 bit.

Uniform and simple is good...

Re:*Why* do I have that feeling... by Anonymous Coward · 2004-01-23 19:58 · Score: 0

Regardless of whether 32-bit binaries are faster than 64-bit binaries, we'll have to move over as we deal with more and more data.

It doesn't matter that 32-bit is faster: programs are going to get larger, required RAM is going to grow, and whether 64-bit is better right now is irrelevant because in the evolution of computers, it doesn't matter.

Why would we have moved from 16-bit to 32-bit in the first place? And 8 to 16 before that. It's just the evolution of computers as we deal with more and more data.

It's irrelevant and pointless to spend time discussing the speed differences now between 64-bit and 32-bit.

For integers, not floating point by Imperator · 2004-01-23 19:58 · Score: 1

More importantly, an architecture whose registers are 32-bits wide is far less efficient when it comes to dealing with values that require more than 32 bits to express. Many floating point values use 64 bits and being able to directly manipulate these in a single register is a lot more efficient than doing voodoo to combine two 32-bit registers.

Most architectures have separate (64-bit or wider) floating point registers. (For example, IA-32 has 80-bit FP registers.) They never have to use their general purpose (integer) registers for FP values. So a 64-bit architecture does nothing for FP. It's only important for manipulating 64-bit integer values. You may say "no one will ever need to count beyond 4294967295" but (a) someone does and (b) pointers are integers, and 64-bit pointers are one of the great advantages of a 64-bit architecture. Previously you needed (as you say) voodoo to combine two 32-bit registers and odds are the architecture didn't really have any support for addressing memory that way. Now with a 64-bit architecture you can stick it in a register and do normal operations with it.
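
For the curious, the "voodoo" for a 64-bit integer add on 32-bit registers looks roughly like this: add the low halves, then carry into the high halves. This is a sketch with made-up helper names, using Python ints masked to 32 bits to stand in for registers; real 32-bit code would typically use an add followed by an add-with-carry instruction.

```python
MASK32 = 0xFFFFFFFF


def split64(x):
    """Split a 64-bit value into (low, high) 32-bit halves."""
    return x & MASK32, (x >> 32) & MASK32


def join64(lo, hi):
    """Recombine two 32-bit halves into one 64-bit value."""
    return (hi << 32) | lo


def add64_via_32(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values held as pairs of 32-bit 'registers'."""
    lo = a_lo + b_lo
    carry = lo >> 32                      # 1 if the low halves overflowed
    lo &= MASK32
    hi = (a_hi + b_hi + carry) & MASK32   # wraps like a 64-bit register would
    return lo, hi


if __name__ == "__main__":
    a, b = 4294967295, 1                  # counting one past the 32-bit limit
    lo, hi = add64_via_32(*split64(a), *split64(b))
    print(join64(lo, hi))
```

On a 64-bit machine all of that collapses into a single add on a single register.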

--

Gates' Law: Every 18 months, the speed of software halves.

Useless tests by msobkow · 2004-01-23 20:20 · Score: 1, Insightful

No, it's not a test of whether 32 or 64 bit is faster. It's a test of an obsolete architecture whose fastest younger siblings are still outperformed by IBM, Intel, and AMD.

The results tell you nothing about whether you should seriously consider 64 bit, nor where you should actually be using a 64 bit setup.

Maybe someone can post the performance results for Doom running on a new AMD 64 bit box with a top-end ATI or NVidia card. It'd be about as relevant as the performance of a SPARC5 is to making a purchase decision.

--
I do not fail; I succeed at finding out what does not work.

Re:64Bit will be needed when Solid State memory co by xeon4life · 2004-01-23 20:30 · Score: 1

If solid state memory as fast as you described ever came out, then there wouldn't be a distinction between memory and storage. In a perfect world they would be one and the same; check out brix-os.sf.net for more info.

--
Real programmers can write assembly code in any language. -- Larry Wall

Liquid Video by scottgfx · 2004-01-23 20:33 · Score: 2, Funny

At a TV station I used to work at, we used to send people on searches for "Liquid Video". Pretty much the same results! It's amazing the people that get hired at TV stations. Mr. Blinker-Fluid would be a genius compared to some in my industry of choice.

At the station I'm at now, they send PA's to ask the engineers for the "ChromaKey for the Genlock". :)

--
It's mandatory to wash your hands before returning to the land of Dairy Queen.

Re:Liquid Video by fucksl4shd0t · 2004-01-23 21:47 · Score: 1

Aha! I just remembered another one. Awhile back my wife and I had a huge fight and we were separated for 10 months, during which time I lived in a swinging bachelor pad. There was this kid in the apartment complex who kept coming over and was really starting to get on my nerves. Well, one day he comes over and asks me for a condom. I had a bunch, hadn't used any of them, and wasn't likely to use any of them in the near future, but the fucker woke me up on my day off to ask me for a condom! SO I told him I'd gone crazy last night, had too much drink, and poked holes in all the condoms. Now that I thought better of it, I didn't want to give him any since they wouldn't work.
I spent 20 minutes arguing with him over whether or not he should use a condom with a hole in it, but the kid never once doubted that I had done it. :)
And I never did get back to sleep, that little bastard. (For the record, when I say 'kid', he was 16)

--
Like what I said? You might like my music

Re:Liquid Video by Anonymous Coward · 2004-01-24 00:32 · Score: 0

I spent 20 minutes arguing with him over whether or not he should use a condom with a hole in it

You should have told him to suck your holy condom covered dick and see if he tastes anything other than stinky latex when you squirt your love juice down his ignorant fucking throat.

He'd be all like, "dude! that's fucked up!" and you'd be all like, "dude! *YOU* are *SUCKING MY DICK*!"
Re:Liquid Video by antiMStroll · 2004-01-24 05:13 · Score: 1

In radio it's assigning the new hire to empty deleted audio from the magnetic tape bulk eraser at the end of the day's shift, a practice we had to drop when enterprising can-do types began dismantling the machine to find the storage tank.

News Flash - Duron takes on all comers.... by twoslice · 2004-01-23 20:36 · Score: 1

Modern processors (which actually stretches back at least 10 years) really want to run out of cache as much as possible

In following your logic, a 128k Duron will run out of cache way before a 2mb Xeon? Making the Duron a better CPU? =)

--

From excellent karma to terrible karma with a single +5 funny post...

Re:News Flash - Duron takes on all comers.... by vrt3 · 2004-01-23 21:28 · Score: 1

I think he meant to say that modern processors _tend to_ run out of cache very often, and more and more so since CPU speeds are increasing much faster than memory speeds. I don't think he sees this as a good thing, though I admit his way of saying it is quite confusing.

--
This sig under construction. Please check back later.

Re:News Flash - Duron takes on all comers.... by dfung · 2004-01-24 00:11 · Score: 1

Well, I did make the original statement in a confusing way. I should have said "modern processors want to execute out of cache as much as possible". You pretty much never want to run out of cache if you don't have to (it forces you to slow way down to access main RAM)! And you generally want to execute out of cache as much as you can (for the fastest possible execution). Superscalar processors (those with multiple execution units) depend on code and data being in cache for good performance - the pipeline will stall if you have a cache miss which means you stop for what effectively is "forever" waiting for the off-chip memory access. For compute-intensive tasks big caches, burst accesses, and deep pipelines are the fast track to fast computation. There's just no way to crank up the speed of the memory subsystem as quickly as you can crank up the internal wiring of the CPU. For data-intensive apps where information is scattered about with low locality (like a database), this on-chip caching isn't nearly as valuable.

It'd be a pain in the ass... by olePigeon+(Wik) · 2004-01-23 20:38 · Score: 1

It'd be a pain in the ass trying to play Karateka at those speeds. :)

16bit is faster than 32bit.......... by UezeU · 2004-01-23 21:09 · Score: 0

nuf said

Re:16bit is faster than 32bit.......... by Anonymous Coward · 2004-01-24 02:16 · Score: 0

The less bits, the faster. Ph34r my 4004

Re:64Bit will be needed when Solid State memory co by evilviper · 2004-01-23 21:20 · Score: 1

In reality, that's just a problem with how we currently handle memory addressing. Harddrives, for instance, are up to about 320GBs, and your 32bit PC can handle it.

--
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant

Re: 64 bits: no magic by Anonymous Coward · 2004-01-23 22:23 · Score: 1, Insightful

Well, a 64-bit ALU is a bit slower than a 32-bit one; it takes more clock cycles per operation.
A CPU with a lot of slow transistors is worse than a CPU with a few quick transistors, so short paths are better.

Page translation in long mode is very slow!!!
On Opteron: 4 levels for 64 bits vs 2 levels for 32 bits (512*512*512*512 pages of 4 KiB vs 1024*1024 pages of 4 KiB), so legacy mode is quicker than long mode.
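
To put numbers on the page-table comparison, here is a minimal C sketch. It assumes 4 KiB pages (12 offset bits) and power-of-two table sizes; `va_bits` is a made-up helper name for illustration, not something from AMD's documentation.

```c
#include <assert.h>

/* Hypothetical helper: virtual-address bits covered by `levels` levels of
   page tables with `entries` entries each, on top of a 4 KiB page
   (12 offset bits). `entries` must be a power of two. */
unsigned va_bits(unsigned levels, unsigned entries)
{
    unsigned bits_per_level = 0;

    assert(entries != 0 && (entries & (entries - 1)) == 0);
    while ((1u << bits_per_level) < entries)
        ++bits_per_level;            /* ends up as log2(entries) */

    return levels * bits_per_level + 12;
}
```

`va_bits(4, 512)` gives 48 and `va_bits(2, 1024)` gives 32, so a TLB miss in long mode may have to walk four tables where legacy mode walks two.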

And the cache penalty is a little high:
With 1 MiB of L2 cache, an array of 10'000'000 longs is a bit slower than an array of 10'000'000 ints.

And for building LEGO-like machines, an AthlonXP is a better choice than the expensive Opteron.

open4free

Sex appeal? by Alsee · 2004-01-23 22:59 · Score: 1

Why go through all the trouble to make it 64-bit anyway? Other than sex appeal, what other reasons are there for 64-bit?

Alrighty, if someone can show me some Hot Babes who think running 64-bits gives me more sex appeal then I'll run out and buy a 64-bit system right now. I'll take a half dozen. Hell, make them 128-bit and don't bother me with benchmarks!

--
- - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.

Re:Sex appeal? by JollyFinn · 2004-01-24 00:48 · Score: 1

Why go through all the trouble to make it 64-bit anyway? Other than sex appeal, what other reasons are there for 64-bit?

Alrighty, if someone can show me some Hot Babes who think running 64-bits gives me more sex appeal then I'll run out and buy a 64-bit system right now. I'll take a half dozen. Hell, make them 128-bit and don't bother me with benchmarks!

Well, you assumed it was hot chicks that were attracted by such sex appeal. I think it's more of a manly thing in this matter ;)

--
Emacs is good operating system, but it has one flaw: Its text editor could be better.

Memory Controller on Chip by turgid · 2004-01-23 23:07 · Score: 1

They get that performance increase on legacy code that doesn't even know about the extra registers by putting the memory controller on-chip, instead of on the northbridge of the motherboard. This is something that RISC CPUs have been doing for a decade or so. Rumour has it that the Opteron is architecturally very similar to the Alpha.

--
Stick Men

Re: OSNews by BigFootApe · 2004-01-23 23:20 · Score: 3, Insightful

You are correct, although the issues are more subtle than your examples (not hard).

A benchmark is useless without interpretation. The people at OSNews have failed to give us any technical background information on the SparcV chip (penalties running in 64-bit as well as benefits), a proper breakdown of the type of math done by the example programs, as well as analyses of bottlenecks in the benchmarks (MySQL, for instance, is possibly I/O limited).

They've given us raw numbers, with no thought behind them. This is what makes a bad article.

Summary of discussion by Gadzinka · 2004-01-23 23:22 · Score: 2, Informative

It's all in benchmark. It doesn't matter what you benchmark, only what you benchmark with ;)

But there are several points

1. The results for openssl are no good because openssl for sparc32 has critical parts written in asm, while for sparc64 it is generic C.

2. The results would be much better if you did it with Sun's cc, which is much better optimised for both sparc32 and sparc64.

3. The results, even if they were accurate, are good only for sparc32 vs sparc64. Basically, sparc64 is the same processor as sparc32, only wider ;)

I don't know what the case is for ppc32 vs ppc64, but when you look at x86 vs x86-64 (or amd64 as some prefer to call it) you have to take into account a much larger number of registers, both GP and SIMD.

As a matter of fact, x86 is such a lousy architecture that it really doesn't have GP registers -- every register in an x86 processor has its own purpose, different from the rest. It looks better in the case of FP and SIMD operations, but it's ints that most programs deal with. Just compile your average C code to asm and look at how much of it deals with swapping data between registers.

(well, full symmetry of registers for pure FP, non-SIMD operations was true until the P4, when Intel decided to penalize the use of the FP register stack and started to ``charge'' you for ``FP stack swap'' commands, which were ``free'' before, and are still free on AMD processors)

x86-64, on the other hand, in 64bit mode has twice as many registers with full symmetry between them, as well as even more SIMD registers. And more execution units accessible only in 64bit mode.

But from these chaotic notes you can already see that writing a good comparison of different processors takes a little bit more than ``hey, I've some thoughts that I think are important and want to share''. And the hard work starts with a proper title for the story -- in this case it should be ``Are sparc64 binaries slower than sparc32 binaries?''.

Robert

--
Bastard Operator From 193.219.28.162

Re:64Bit will be needed when Solid State memory co by Anne+Thwacks · 2004-01-23 23:25 · Score: 1

Ie shortly after pigs fly.

When we get solid state hard drives and if they're reliable and fast as regular ram then ram will be gone and the SSD will take over.

I remember predicting that SSDs would take over "next year" every year from 1980-1985. Then I gave up smoking.

--
Sent from my ASR33 using ASCII

Well, yes by El · 2004-01-23 23:29 · Score: 0, Redundant

If the 32-bit app fits nicely in cache, but the 64-bit app doesn't, then the 32-bit app could be faster -- for certain problem sets.

--

"Freedom means freedom for everybody" -- Dick Cheney

Re: OSNews by Shanep · 2004-01-24 00:13 · Score: 1

Which brings up a point: both NetBSD/Sparc and NetBSD/Sparc64 will run on an Ultra 1, which is a 64 bit machine. Why doesn't somebody install each NetBSD port on two separate Ultra 1 machines? Then the benchmark comparison can be between the normal apps that build on both systems, running in parallel on two identical systems. It's exactly the same codebase except for the 32 or 64 bittedness.

Hey there's a good idea. Is NetBSD absolutely exactly the same between the sparc and sparc64 flavours, minus the compile options of 32 vs 64bit?

I use OpenBSD. I would be willing to install OpenBSD/SPARC64 onto my Ultra 10, run some benchmarks, then install OpenBSD/SPARC and run the same benchmarks, then compare.

Anyone know if OpenBSD SPARC64 and SPARC are close enough in code difference to make this worthwhile?

I recently put OpenBSD 3.4 -stable/SPARC64 onto my 333MHz Ultra 10 and compared it with OpenBSD 3.4 -stable on my old 300MHz G3 iBook. The iBook was about 3 times faster than the Sun in ubench, and benchmarks of md5, sha1 and ssl.

Making the OpenBSD -stable release on the G3 (192MB RAM and slow 6GB notebook HDD) took about 6 hours whereas on the Ultra 10 (128MB in one bank) it took about 8-9 hours.

Disk transfer rate is kinda crappy too. I have two 20GB Seagate IDE drives, one in the Ultra and one in my Thunderbird 700. The Thunderbird gives me a transfer rate of 28MB/s whereas the Ultra gives 12MB/s (they are exactly the same model of drive). I realise there might be a lot involved in limiting the IDE performance on this particular Ultra 10, but I would have thought that both systems could saturate the bandwidth available from these oldish drives. I've heard that the Ultra 5/10's IDE controller sucks; perhaps there is truth to that. Someone claimed that replacing it with SCSI is like having a whole different (faster) machine.

You have now intrigued me into trying OpenBSD/SPARC. I am, however, loath to remove Solaris 9, now that I finally got it installed again. It takes *ages* to install, even from DVD-ROM. It seems that after every thing it installs, it pauses for a set 90 or sometimes 30 seconds. Meaning I have two options: sit there and click to continue, or go away and wait even longer.

--
War crimes, torture, lies, illegal spying... Would someone give Bush a blowjob, already, so he can be impeached?

Re: OSNews by Shanep · 2004-01-24 00:53 · Score: 1

You have now intrigued me into trying OpenBSD/SPARC.

Actually, forget it. I've just realised, OpenBSD/SPARC probably won't have a kernel that supports the devices in a typical SPARC64 system.

For all I know, OpenBSD/SPARC64 is compiled 32-bit. I shall have to investigate this. I can't believe this Ultra 10 has been flogged by my little G3.

--
War crimes, torture, lies, illegal spying... Would someone give Bush a blowjob, already, so he can be impeached?

6502? Not a chance by p3d0 · 2004-01-24 01:55 · Score: 1

A 2GHz 6502 would be a screamer.

Ok, let's think about this. For starters, the 6502 needed something like 2-8 clock cycles per instruction. In contrast, the Athlon 64 can execute 3 instructions per cycle.

Besides, you couldn't possibly run a 6502 at 2GHz with today's technology. The chip is not pipelined, so there is way too much logic to complete in each 500ps clock cycle.

Even if you made a new 6502-compatible design that runs at 2GHz, it only had one 8-bit general-purpose register, so to do any useful math would require a lot of arithmetic instructions and a hell of a lot of spills. And where would these spills go? The stack is only 256 bytes.

Want to multiply two 32-bit numbers? Be prepared to do 16 individual multiply operations. And guess what? 6502 has no multiply instruction, so each of those 16 multiplies requires a series of shifts and adds. Plus, the result of each multiply is 16 bits long, so it won't fit in your one GPR. You either need to keep part of the result in memory or in one of the index registers, and either of those is painful. I remember tuning an 8 x 8 -> 16-bit multiply and getting it down to about 200 clock cycles, and that was the best I could do. So, you can expect your 32 x 32-bit multiply to take something like 3000 clock cycles and involve dozens or hundreds of loads and stores.
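
The decomposition described above can be sketched in C. This is only an illustration of the arithmetic (a shift-and-add 8 x 8 multiply, then 16 byte-wide partial products), not 6502 code and not the poster's tuned routine; the names `mul8x8`, `mul32x32` and `mul_demo` are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* 8 x 8 -> 16-bit multiply using only shifts and adds, the way a CPU
   without a multiply instruction has to do it. */
uint16_t mul8x8(uint8_t a, uint8_t b)
{
    uint16_t product = 0;
    uint16_t addend = a;

    for (int bit = 0; bit < 8; ++bit) {
        if (b & 1u)
            product = (uint16_t)(product + addend);
        addend = (uint16_t)(addend << 1);
        b >>= 1;
    }
    return product;
}

/* 32 x 32 -> 64-bit multiply built from 4 x 4 = 16 partial products,
   each one an 8 x 8 multiply shifted into its byte position. */
uint64_t mul32x32(uint32_t x, uint32_t y)
{
    uint64_t result = 0;

    for (int i = 0; i < 4; ++i) {
        uint8_t xb = (uint8_t)(x >> (8 * i));
        for (int j = 0; j < 4; ++j) {
            uint8_t yb = (uint8_t)(y >> (8 * j));
            result += (uint64_t)mul8x8(xb, yb) << (8 * (i + j));
        }
    }
    return result;
}

/* Usage check: the largest inputs must match the exact products. */
void mul_demo(void)
{
    assert(mul8x8(255, 255) == 65025u);
    assert(mul32x32(0xFFFFFFFFu, 0xFFFFFFFFu) == 0xFFFFFFFE00000001ULL);
}
```

Each call to `mul32x32` runs `mul8x8` sixteen times, and each `mul8x8` loops eight times, which gives a feel for why the cycle count on a real 6502 climbs into the thousands.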

The moral of the story is, almost every architectural difference between the 6502 and modern CPUs exists for one reason: speed. Still think a 2GHz 6502 would be a screamer?

--
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....

Re:6502? Not a chance by Anonymous Coward · 2004-01-24 06:24 · Score: 0

Even if you made a new 6502-compatible design that runs at 2GHz, it only had one 8-bit general-purpose register, so to do any useful math would require a lot of arithmetic instructions and a hell of a lot of spills.

...and two index registers that allowed fast access to 256 off-chip registers in zero page.

But that's not the point. The point is the 8 bit data bus. Since most 8 bit processors had a 16 bit address bus, I don't see what would preclude an 8 bit processor from having a 64 bit address bus.

It may very well be a major pain in the butt to use, but it's at least a mildly interesting idea.

Re:6502? Not a chance by Brandybuck · 2004-01-24 07:14 · Score: 1

You're missing the point in your focus on reality. Of course there couldn't be a 2GHz 6502! Play the game of "what if?"

If everything else were the same, an 8-bit CPU would blow the pants off of a 32-bit CPU, for benchmarks that don't need more than 8 bits. Pipeline the 6502, give it more registers, whatever. But keep it 8-bit.

The moral of the story is, almost every architectural difference between the 6502 and modern CPUs exists for one reason: speed.

Of course, but you can create analogous architectural improvements to the 6502. The reduced instruction set of 8-bit opcodes allows a level of parallelism that a 32-bit chip couldn't achieve on the same die space. You could put the entire address space in internal cache.

A 64-bit processor isn't going to be inherently faster than a 32-bit processor, except for code that needs 64 bits.

--
Don't blame me, I didn't vote for either of them!

Re:6502? Not a chance by p3d0 · 2004-01-24 08:33 · Score: 1

If everything else were the same, an 8-bit CPU would blow the pants off of a 32-bit CPU, for benchmarks that don't need more than 8 bits.

Explain it to me: where does the speed come from?

I can think of only one way in which a souped-up 6502 would beat a 32-bit processor: the 6502's pointers are only 16 bits. That makes the data structures smaller. Therefore, if you can find a program whose data structures fit in 64KB with 16-bit pointers, but would not fit in the 64KB L1 cache of a modern 32-bit CPU with 32-bit pointers, and which doesn't suffer from the suffocating lack of registers, and spends most of its time doing 8-bit math, then the 6502 might have an advantage on that particular program.

In any other situation, to say the 6502 would beat a modern processor is pure nostalgia.

A 64-bit processor isn't going to be inherently faster than a 32-bit processor, except for code that needs 64 bits.

By similar logic, a 32-bit processor is likely to be inherently faster than an 8-bit processor on any code that "needs" more than 8 bits, which is practically all code.

--
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....

OT: The shape of a Pringle by Nighttime · 2004-01-24 02:11 · Score: 1

IANAM, is there a mathematical term for the shape of a Pringle?

IAAM, it would be called a saddle.

--
I've got a fever and the only prescription is more COBOL.

Re: OSNews by mindstrm · 2004-01-24 02:18 · Score: 1

It's absurd because there is no generic "64 bit" or "32 bit" binary... whether they are faster or not is up to individual architectures and implementations.

On a sparc capable of running 64 and 32 bit binaries, sure, it's a valid test, but irrelevant anywhere else. On an Athlon-64, the opposite might be true, or they might be the same.

It should be titled "Are sparc 64 bit binaries under solaris faster or slower than equivalent 32 bit binaries?"

sizeof(int) by wowbagger · 2004-01-24 02:33 · Score: 2, Insightful

The biggest fault I can see with this test depends upon sizeof(int) -

I don't know about Sun, but in some other environments in which a 32 bit and a 64 bit model exist, the compiler will always treat an int as 32 bits, so as not to cause structures to change size. Hell, even on the Alpha, which was NEVER a 32 bit platform, gcc would normally have:

sizeof(short) = 2
sizeof(int) = 4
sizeof(long) = 8

Now, consider the following code:

for (int i = 0; i < 100; ++i)
{
    frobnicate(i);
}

IF the compiler treats an int as 4 bytes, and IF the compiler has also been informed that the CPU is a 64 bit CPU, then the compiler may be doing dumb stuff like trying to force the size of "i" to be 4 bytes, by masking it or other foolish things.

So, the question I would have is, did the author run a test to ensure that the compiler was really making ints and unsigneds be 64 bits or not?
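
One cheap way to answer that question on any given compiler is to ask it. The following is a minimal sketch; `data_model` is a made-up helper, and the labels are the conventional names for the common size combinations rather than anything the article's author reported.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: name the C data model this file was compiled for,
   judging only by sizeof. */
const char *data_model(void)
{
    const size_t i = sizeof(int), l = sizeof(long), p = sizeof(void *);

    assert(CHAR_BIT == 8);   /* the byte counts below assume 8-bit bytes */

    if (i == 4 && l == 4 && p == 4) return "ILP32";  /* typical 32-bit build */
    if (i == 4 && l == 8 && p == 8) return "LP64";   /* typical 64-bit Unix build */
    if (i == 4 && l == 4 && p == 8) return "LLP64";  /* 64-bit Windows */
    if (i == 8 && l == 8 && p == 8) return "ILP64";
    return "other";
}
```

Printing the result with `printf("%s\n", data_model());` in both the 32-bit and the 64-bit build would settle it. If the 64-bit build reports LP64, the `int i` in the loop above is still 4 bytes wide, which is exactly the case described.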

--
www.eFax.com are spammers

68000 by Anonymous Coward · 2004-01-24 02:46 · Score: 0

68000 had 16 bits data, 32 bits address - this was a 16 bit

the 68000 has 16 bits data, 24 bits address and 32 bits internal; hundreds of usenet flamewars couldn't determine which one to use.

I'd go with 32 bit because that's how many it has from the programmer's perspective.

Re:64Bit will be needed when Solid State memory co by Anonymous Coward · 2004-01-24 02:46 · Score: 0

Actually I remember developing software on a system with non-volatile main memory. It was a DEC PDP-11, back in the late 70's. It's called 'core' memory. It's slow, certainly by today's standards.

Re: OSNews by Anonymous Coward · 2004-01-24 02:47 · Score: 0

Knowing OSNews, though, it wouldn't be an Atari ST at all...

Two Words......"Game Physics" The only reason WE by Anonymous Coward · 2004-01-24 03:17 · Score: 0

would need it for the most part.

Close to reality physics in a game engine
requires some pretty large numbers. 64bit will
help realize this.

Now if we only had REAL video systems instead
of this playground videocard shit they keep feeding us
(At outrageous prices no less)
We would be doing some serious app development.

the old .45 is better than 9mm discussion. by Anonymous Coward · 2004-01-24 03:26 · Score: 0

Back when the initial tests were done, .45 WAS better than 9mm in handguns.
The bullets they used were both hardball.

64bit hasn't gone through the ridiculous end-user
rack yet. Ammo changed, so will 64-bit.

Re: OSNews by kayen_telva · 2004-01-24 03:42 · Score: 0, Offtopic

how is this offtopic ?
he claimed the article was trash and I disagreed.

Re: OSNews by cruelworld · 2004-01-24 03:54 · Score: 2, Insightful

And what if the compiler sucks/has no optimizations for 64-bit binaries?

Re: OSNews by Dave2+Wickham · 2004-01-24 04:20 · Score: 0, Redundant

/me wonders if the "AntisLash" is there purposefully, or just a coincidence...

20 1,024-bit RSA private key ops/s? by Anonymous Coward · 2004-01-24 04:44 · Score: 0

What kind of a shitty processor is this 333 MHz UltraSparc IIi? 20 ops/s is RIDICULOUS! Were you to increase clock frequency by one order of magnitude you would still be left with 200 ops/s, which is still ridiculous: a modern 1.5 GHz Madison processor does 6,000 ops/s.

If this is the best UltraSparc can do, Sun might just as well shut down its fabs.

Re: OSNews by Anonymous Coward · 2004-01-24 04:46 · Score: 0

DG Unix was more usable than Solaris to me. Sun just doesn't play well with others. All of your prior *nix knowledge doesn't apply and all of your favorite commands don't work or are renamed. I can't wait until they fail.

Re: OSNews by sjames · 2004-01-24 04:50 · Score: 1

Duh. Any developer who's been around 64-bit environments for more than a few weeks knows this. It's not like there's some subtle magic going on here; bigger pointers means more data to schlep around.

Apparently not. Look at all of the 64 bit hype running around! Apparently people do think that 64 bit will magically be faster. This is NOT confined to naive desktop users. I see it all the time amongst scientists who believe that their computational application in FORTRAN will become faster just by running at 64 bits.

64 bits is a minimum for Math/Science by Tony+Hammitt · 2004-01-24 05:11 · Score: 1

I've had a 64-bit desktop for years now, ever since we got our shiny new DEC Alphas back in 199x. We were very happy to finally have native 64-bit floating point so we could use double precision and finally get a correct answer quickly for matrix multiplies where the matrix was bigger than 16x16. 32-bit floating point is just plain useless due to roundoff error and 64-bit is quite a lot better.

On expensive systems, you routinely use 96-bit extended precision or even 128-bit if you actually have matrices that exceed 4096x4096 elements, because even with 64-bit, you lose precision.
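
A small C sketch of the kind of roundoff being described: adding 1.0 over and over in single and in double precision. It assumes IEEE 754 binary32/binary64 arithmetic with round-to-nearest-even, which C itself does not guarantee; the function names are made up for the example.

```c
#include <assert.h>

/* Add 1.0 to an accumulator n times in single precision. `volatile`
   forces each partial sum to be rounded to float, even on hardware
   that computes in wider registers. */
float sum_ones_float(unsigned long n)
{
    volatile float acc = 0.0f;

    for (unsigned long k = 0; k < n; ++k)
        acc = acc + 1.0f;
    return acc;
}

/* Same loop in double precision. */
double sum_ones_double(unsigned long n)
{
    volatile double acc = 0.0;

    for (unsigned long k = 0; k < n; ++k)
        acc = acc + 1.0;
    return acc;
}

/* Assumes IEEE 754 formats: a float has a 24-bit significand, so once
   the sum reaches 2^24 = 16777216, adding 1 no longer changes it. */
void roundoff_demo(void)
{
    assert(sum_ones_float(20000000UL) == 16777216.0f);
    assert(sum_ones_double(20000000UL) == 20000000.0);
}
```

The single-precision sum silently stops growing at 16777216, while the double-precision sum is exact. Long accumulations such as the inner products of a large matrix multiply run into the same effect.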

So here's to the (new, haha) 64-bit desktops. We'll see a use for the extended precision immediately, since we scientists have seen the need for it for the 10 years it's been around. It may even be useful for gamers...

I can put it more simply by Anonymous Coward · 2004-01-24 05:11 · Score: 0

P4 is a 386 chip. So is Athlon.

To take advantage of the higher number of transistors available nowadays, you alter the architecture. That is all Intel and AMD have done.

If you made a 386 "transistor for transistor" I don't believe it could break 300MHz.

More x86-64 goodness by Z-MaxX · 2004-01-24 05:35 · Score: 1

Besides the fact that x86-64 (Opteron and friends) provides twice as many general purpose registers, and twice as many 128-bit XMM registers, 64-bit mode also enables RIP-relative Data Addressing. From the AMD x86-64 Programmer's Manual Vol 1 page 8:

64-bit mode supports data addressing relative to the 64-bit instruction pointer (RIP). The legacy x86 architecture supports IP-relative addressing only in control-transfer instructions. RIP-relative addressing improves the efficiency of position-independent code and code that addresses global data.

Another small advantage in 64-bit mode is the ability to use the low byte of any of the sixteen GPRs for byte operations. This results in a uniform set of byte, word, doubleword, and quadword registers that is better suited to compiler register-allocation.

--
Dr Superlove 300ml. I use my powers for awesome

So... by Anonymous Coward · 2004-01-24 05:58 · Score: 0

I'm not an expert, but I would think, yeah, you get twice the number of wires.
So that would mean twice the data per clock tick.

But don't forget the apps are going to need the extra 32 bits (32+32=64 :) of data for pointers, etc.

I can't help but chuckle to myself when I hear all about these "new" 64 bit units. If I remember correctly, "chip people" could have made 64 bit processors when hardware manufacturers could barely keep up with 8 bit units.

Dumb Article - Wrong Conclusion by Anonymous Coward · 2004-01-24 06:03 · Score: 0

If you want to assess the general advantages (or not) of 64-bit computing, then you do the following:

1. Choose functions that take advantage of 64-bit, like floating-point calculations.

2. Choose applications that have been optimized for 64-bit.

3. Compile with a compiler that will optimize for 64-bit.

4. Test on more than one 64-bit platform.

But this author:

1. Chose functions that do _not_ involve a lot of 64-bit operations.

2. Chose functions that have _not_ been optimized for 64-bits.

3. Chose a compiler that is _not_ optimized for 64-bits.

4. Tested on only _one_ 64-bit platform.

And then he concluded that 64-bit computing sucks.

Gee, what a surprise.

Some of the fastest supercomputers and server farms in the world today are using 64-bit platforms. But they're being used for applications that benefit from 64-bit computing, like rendering the graphics in Hollywood movies.

The author talks about the weaknesses of his benchmarks at the end of the article, and I think he's probably being sincere.

At the same time, I am well aware of the fact that Microsoft has been trying to FUD the idea of 64-bit computing. Microsoft knows that Windows is not ready for 64-bit (it may have been released, but the performance sucks), and that anyone who chooses 64-bit hardware these days will end up running Linux or Unix.

A summary of 64 bit costs and benefits by akuma(x86) · 2004-01-24 06:22 · Score: 1

With 64 bits you need to make the critical path of the processor wider (the ALUs, the register file etc...). This makes your circuits slower. You pay an opportunity cost in frequency -- you could have made the processor run at a higher frequency with a 32-bit datapath. For those of you with a background in computer arithmetic, think about computing the carry bit in an adder.

64 bit addressing means that your pointers now take up twice as much space as your old 32 bit pointers. This exerts more pressure on your memory system and makes your L1 and L2 caches appear smaller. It also makes your memory bandwidth appear narrower since you need to ship more data across the bus between the processor and DRAM. You can't say something like - oh the 64 bit processor's bus is twice as wide because you could have made the 32 bit processor's bus twice as wide. In other words, the width of the bus is orthogonal to the instruction set supporting 64 bits. Similar arguments hold for caches.
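
To make the cache-pressure argument concrete, here is a minimal C sketch. The two structs are stand-ins for the same doubly linked list node as a 32-bit and a 64-bit build would lay it out; fixed-width integers play the role of the pointers so both can be compared in one program. All names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The same node with 32-bit and with 64-bit "pointers". */
struct node32 { uint32_t next, prev; int32_t value; };
struct node64 { uint64_t next, prev; int32_t value; };

/* How many whole nodes fit in a cache of `cache_bytes` bytes. */
size_t nodes_in_cache(size_t cache_bytes, size_t node_bytes)
{
    assert(node_bytes != 0);
    return cache_bytes / node_bytes;
}
```

On a typical LP64 ABI `sizeof(struct node64)` comes out as 24 (20 bytes of fields padded up to 8-byte alignment) against 12 for `struct node32`, so a 16 KiB L1 data cache holds 682 of the wide nodes where it held 1365 of the narrow ones. The exact padding is ABI-dependent, so treat those sizes as an assumption.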

The benefit of x86-64 is that it provides more registers. This means that there are fewer loads and stores which are used to spill and fill stack variables. This reduces the number of instructions required. It doesn't necessarily speed up the processor because stack variables would live in the store-forwarding buffer and would bypass the cache lookup, but you do have fewer instructions which means less pressure on your instruction cache and lower instruction-fetch bandwidth requirements.

There is a benefit for applications that naturally use 64-bit data types. This is a minor effect because many apps do not need the full 64 bit integer arithmetic. This benefit is also offset by the opportunity cost of frequency described above.

The other big benefit is that you can now address a 64-bit virtual address space which is convenient for accessing large data structures. This is the main reason why processor architects want to move to 64 bits.

The condensed version by Rui+del-Negro · 2004-01-24 06:26 · Score: 1

Yes.

Re: OSNews by Anonymous Coward · 2004-01-24 06:43 · Score: 0

OSNews... Useful?

He'll be back. ;)

bits by Saville · 2004-01-24 06:57 · Score: 2, Insightful

and merely counting bits is no way to estimate performance.

If you only have room for 16k of data in your L1 cache and all your size_t, pointers, and in most cases longs too take twice as much memory, at worst it is like you have only 8k of cache now compared to the 32bit version!
At best it is going to make no difference, but at worst it is like your system now has only half the cache and half the memory bandwidth. Seems to me that by counting bits you can estimate your performance will be between 100% and 50% of the 32bit version, all other things equal.
A notable exception would be when you need a 64bit value and are forced to emulate that.
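
For a feel of what that emulation costs, here is a minimal C sketch of a 64-bit add done with two 32-bit halves and a manual carry, roughly what a 32-bit target has to do. The type and function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves. */
typedef struct { uint32_t lo, hi; } u64_pair;

/* 64-bit add from 32-bit operations: add the low halves, detect the
   wrap-around, and feed it into the high halves as a carry. */
u64_pair add64_emulated(u64_pair a, u64_pair b)
{
    u64_pair r;

    r.lo = a.lo + b.lo;                  /* may wrap around modulo 2^32 */
    r.hi = a.hi + b.hi + (r.lo < a.lo);  /* a wrapped sum is smaller: carry 1 */
    return r;
}

/* Reassemble the halves so the result can be compared with native math. */
uint64_t to_u64(u64_pair p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* Usage check: (2^32 - 1) + 1 must carry into the high half. */
void add64_demo(void)
{
    u64_pair a = { 0xFFFFFFFFu, 0u };
    u64_pair b = { 1u, 0u };
    u64_pair c = { 0x89ABCDEFu, 0x01234567u };

    assert(to_u64(add64_emulated(a, b)) == 0x100000000ULL);
    assert(to_u64(add64_emulated(c, c)) == 2u * 0x0123456789ABCDEFULL);
}
```

One native 64-bit add becomes two adds plus a compare here, and multiplies and divides fare considerably worse.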

Re:bits by True+Grit · 2004-01-25 12:48 · Score: 1

If you only have room for 16k of data in your L1 cache and all your size_t, pointers, and in most cases longs too take twice as much memory, at worst it is like you have only 8k of cache now compared to the 32bit version!

Let me requote the original parent for clarity:

In fact, 64-bit architecture means a lot more than pointer size, and merely counting bits is no way to estimate performance.

From the context, the OP was speaking in general terms, and his only point was that the move to a 64 bit architecture from a 32bit one almost always involves more than just the 32-to-64 switch.

For example, your claim that the cache is effectively cut in half isn't true with the AMD64 architecture, which can run 32bit software natively (w/o emulation), and 64 bit apps can still work with 32 bit (or smaller) data if they want, so only memory pointers in a 64 bit app double in size, but not necessarily the rest of your data. The effectiveness of the cache is still reduced, but not by half, and the other improvements, like the doubling of general purpose registers, and the slight extension of the CPU's pipeline, will offset the added bulk of 2x pointers.

Since AMD64 is an extension of the IA32 architecture, and is still compatible with it, not everything that was 32 bits automatically becomes 64 bits, which is why AMD64 looks so interesting, as you can basically have the best of both worlds, shifting to 64 bits only when it really benefits you (although odds are the extra registers of the AMD64 will probably result in a lot of 32 bit apps seeing better performance if recompiled).

Re: OSNews by Anonymous Coward · 2004-01-24 07:03 · Score: 0

If they do NEED the 64 bit precision, then yes... Obviously, it's going to be faster because they'll have to do about half the cycles to complete the same amount of work.

It's not going to be faster because 64bit computers are some sort of magical beast. That's stupidity.

Of course, almost nobody needs a 64bit computer right now. Most motherboards can't even address 2GB of RAM, let alone 4, and the smallish servers that DO have more than 4GB generally don't use *that* much more, if they even need it.

If 64 bit computers were relegated to huge server farms and supercomputers for a few more years, I wouldn't be disappointed.

Poor guy... by ShieldW0lf · 2004-01-24 07:11 · Score: 1

Other than sex appeal, what other reasons are there for 64-bit?

This is a man who is NEVER going to get laid.

--
-1 Uncomfortable Truth

This did not test 64-bit versus 32-bit, duh! by Anonymous Coward · 2004-01-24 07:12 · Score: 0

What this tested was 32-bit applications recompiled in 64-bit mode. Which means that the CPU is constantly executing extra code to truncate 64-bit numbers down to 32-bits, which is one-half the reason it is slower and larger. The other half the reason it is larger is because the default size for constants etc. is now twice as big (as well as the space taken for anything actually declared "int", and not wrapped in some compatibility-layer "int32" typedef). And the other half the reason it is slower is because larger code takes longer to load into the instruction cache, and thrashes the cache more.

So the lesson here is, "Don't blindly recompile your 32-bit applications in 64-bit mode, they'll get slower." I have some apps here that make heavy use of the "long long" type that I'd love to try on a 64-bit machine someday. But since that machine in 32-bit mode will probably handle long-longs quickly, 32-bit *may* still be a win.

Re: OSNews by Anonymous Coward · 2004-01-24 07:28 · Score: 0

I just want to point out that you can put 1 gig of RAM in an Ultra 5, not the 512 MB that the author claims. There are 4 slots, arranged as 2 banks of 2 DIMMs. You can put four 256 MB DIMMs in there. Sun part X7032A.

High Level of Confusion Apparent by DavidStewartZink · 2004-01-24 07:37 · Score: 1

Within my memory, we switched from 8-bit to 16-bit to 32-bit and now to 64-bit computing.

We are now in a transitional period, when 64-bit CPUs exist and can process 32-bit code "natively" (by setting a bit in a CPU control register, without explicit affixes to each opcode or operand specification saying that the operands are 32 bits wide). Yet at the same time, in a 32-bit executable we can still do 64-bit math at the native speed of the 64-bit processor by using operand-size affixes.

In such an environment, expect 32-bit applications compiled into 64-bit executables to be slower, since C programmers (due to C's weakness as a systems programming language) tend to make "int32" typedefs that are processed more slowly in 32-bit mode due to affixes and possibly the need for sign-extension/truncation.
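
The truncation and sign-extension steps mentioned here look like this when written out in C. This is only a sketch of the arithmetic, with function names invented for the example. A compiler will often emit a single instruction for each step, so the real cost depends on how often they occur.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 32 bits of a 64-bit register value. */
uint64_t truncate32(uint64_t reg)
{
    return reg & 0xFFFFFFFFu;
}

/* Interpret the low 32 bits of a 64-bit register value as a signed
   32-bit integer and widen it back to 64 bits. */
int64_t sign_extend32(uint64_t reg)
{
    uint32_t low = (uint32_t)reg;
    /* Bit 31 set means the 32-bit value is negative in two's complement. */
    int64_t result = (low & 0x80000000u)
                         ? (int64_t)low - 4294967296LL
                         : (int64_t)low;
    assert(result >= INT32_MIN && result <= INT32_MAX);
    return result;
}
```

For instance, `sign_extend32(0xFFFFFFFF)` gives -1, while `truncate32` applied to a value above 32 bits simply drops the upper half.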

If you have a native 64-bit program (such as a cryptography program that makes extensive use of "int64" ("long long" in 32-bit gcc)) then it may show a speed improvement when built as a 64-bit executable.

Down the road, expect post-transition CPUs that can only process "int32"s with a slight speed hit.

Benchmarking 16-bit versus 32-bit on today's top end processors would be interesting, though it is important to distinguish between "16-bit applications" and "32-bit applications compiled in 16-bit mode".

Re: OSNews by be-fan · 2004-01-24 07:39 · Score: 2, Interesting

If the compiler sucks, then it would suck equally for 32-bit and 64-bit binaries! They use the same code generator!

--
A deep unwavering belief is a sure sign you're missing something...

Re: OSNews by Anonymous Coward · 2004-01-24 07:45 · Score: 0

That can sometimes be a positive thing

Not necessarily. I remember when I was starting out I'd get really confused because different guides would have conflicting (or at least inconsistent) advice.

Re: OSNews by Nothinman · 2004-01-24 07:45 · Score: 1

OpenBSD's sparc64 port is 64-bit but the last time I tried it there was still a lot of work to do. It worked for the most part (it was just a pf box) but a lot of little problems caused me to drop it in favor of Debian/sparc64 again.

And it was a PITA to install since there's no tftp bootable images like there are for Linux.

Re: 64 bits: no magic by Anonymous Coward · 2004-01-24 07:48 · Score: 0

"And for building like-LEG0-machines, is better with AthlonXP than with the expensive Opteron."

I don't think you can call the Athlon 64 3000+ "expensive"...

Regards, Anon. Coward

Re: OSNews by sjames · 2004-01-24 08:06 · Score: 1

Agreed. There is a very valid place for 64 bit. Of course, the ones I was referring to explicitly just wanted to speed up their current application (which was happily running on 32 bit machines, using single floats).

Re: 64 bits: no magic by Anonymous Coward · 2004-01-24 08:39 · Score: 0

For tomorrow, it will be Athlon32 4800+.

Many current motherboards only support up to 2 x 512 MiB DDR400 or 2 x 1024 MiB DDR400 using 2 DIMM slots.
But if it's 3 or more DIMM slots, then it drops down to DDR266 :(, I don't know why!!!

open4free

Bigger register pool, amongst other things.. by Svartalf · 2004-01-24 08:46 · Score: 1

The x86 is register starved- severely so. The AMD64 architecture in 64-bit mode adds a bunch more GP Registers for computational purposes. It's enough to make a boost of up to 30% right there. There's a couple of other things within the architecture which add their own contributions.

All in all, AMD64 is more of an exception than the norm. Normally 64-bit code should be expected to be as fast as 32-bit code at best and only slightly slower at worst. When people talk about the stuff being dramatically slower, they're referring to the increase in memory bandwidth- and they don't take into account that memory is set up in a manner that is not byte-wide. It's typically 32-bits wide and possibly 64-bits wide on some designs. This would translate into a small to moderate performance hit in some designs and none in others.

--
I am not merely a "consumer" or a "taxpayer". I am a Citizen of the State of Texas

hey! you forgot the link... by SlashDotAgent · 2004-01-24 08:49 · Score: 1

apples -- to -- apples

half-bits are smaller, lighter by rtv · 2004-01-24 08:51 · Score: 1

Why not save space by using 0.5 instead of 1?

This is informative? by Svartalf · 2004-01-24 08:52 · Score: 1

Depends on the size of the bus as to whether or not it'd take longer for 64-bits instead of 32-bits.

If your bus is 64-bits, it won't take any longer than the 32-bits would on a 32-bit machine.
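
The bus-width argument comes down to a ceiling division. A small C sketch, assuming a simplified model where each bus transfer moves exactly one bus width of data and ignoring caches, bursts and latency; `bus_transfers` is a name made up here:

```c
#include <assert.h>

/* Number of bus transfers needed to move a value of value_bits bits
   over a bus that carries bus_bits bits per transfer (rounded up). */
unsigned bus_transfers(unsigned value_bits, unsigned bus_bits)
{
    assert(bus_bits > 0);
    return (value_bits + bus_bits - 1) / bus_bits;
}
```

Under this model `bus_transfers(64, 64)` and `bus_transfers(32, 32)` are both 1, while `bus_transfers(64, 32)` is 2, so the penalty only appears when the value is wider than the bus.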

--
I am not merely a "consumer" or a "taxpayer". I am a Citizen of the State of Texas

Re:This is informative? by Anonymous Coward · 2004-01-24 19:02 · Score: 0

I guess the wording "can take" was unclear to you?

Wait a minute!! less-bits = more-speed?!? by SlashDotAgent · 2004-01-24 08:57 · Score: 1

If 64bit is slower than 32bit, then...
32bit is slower than 16bit, therefore...
16bit is slower than 8bit, so...
8bit is slower than 4bit, which means...
4bit is slower than 2bit, in which case...
2bit is slower than 1bit, which concludes that...
1bit is faster than 64bit!!!

Re:Wait a minute!! less-bits = more-speed?!? by SuiteSisterMary · 2004-01-28 03:18 · Score: 1

Exactly right.

The trade-off, however, is what the bits themselves can do.

Is it faster to move 1 bit 32 times, or 32 bits once?

This is the rationale behind Serial ATA hard drives; it's faster to just blast the bits than to waste time syncing them to go in parallel.

--
Vintage computer games and RPG books available. Email me if you're interested.

OpenSSL by Firethorn · 2004-01-24 09:22 · Score: 1

OpenSSL might be kind of arbitrary

Actually, I'd say it's a good choice. After all, it does involve mathematical operations on large numbers. Many websites use it on their servers for sales, passwords, and account access, as well as other security concerns.

The computing power demands of one SSL connection might not be much, but when you get into hundreds of connections, this gets to be a major strain on hardware. If going 64 bit reduces the number of cycles needed to process a thread, it can reap major benefits.

This seems to be a better choice than gzip and MySQL, as gzip is often associated with fetching things from off-board storage (network/HD/something else slow), and MySQL is often memory limited as you're messing with massive amounts of data. So you're left with doing complicated things to a data set that will fit in main memory. I think that what you're doing, and what kind of data you're working on, is a large factor in the results (if the data's 32bit or smaller, the 32bit app might take the lead, where if the data's larger than 32 bit, the 64bit app takes the lead).
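
To see why arithmetic on large numbers favors wider registers, here is a hedged C sketch of multi-word addition done with 32-bit and with 64-bit "limbs". The function names are invented for this example, and real libraries such as OpenSSL use tuned assembly rather than loops like these.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add two n-limb numbers (least significant limb first) using 32-bit limbs.
   The 64-bit intermediate only makes the carry easy to read in C; a 32-bit
   CPU would normally use its add-with-carry instruction. Returns the carry. */
uint32_t add_limbs32(uint32_t *r, const uint32_t *a, const uint32_t *b, size_t n)
{
    uint32_t carry = 0;
    assert(r != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        uint64_t sum = (uint64_t)a[i] + b[i] + carry;
        r[i] = (uint32_t)sum;
        carry = (uint32_t)(sum >> 32);
    }
    return carry;
}

/* The same addition with 64-bit limbs: half as many loop iterations
   for a number of the same total width. Returns the carry out. */
uint64_t add_limbs64(uint64_t *r, const uint64_t *a, const uint64_t *b, size_t n)
{
    uint64_t carry = 0;
    assert(r != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        uint64_t sum = a[i] + carry;
        uint64_t c1 = sum < carry;  /* wrapped while adding the carry */
        sum += b[i];
        uint64_t c2 = sum < b[i];   /* wrapped while adding b[i] */
        r[i] = sum;
        carry = c1 | c2;            /* at most one of the two can wrap */
    }
    return carry;
}
```

A 128-bit sum takes four trips through the 32-bit loop but only two through the 64-bit one. Whether that turns into a proportional speedup on real hardware depends on the compiler, the instruction set and memory behavior.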

--
I don't read AC A human right

What purchase decision? by Valdrax · 2004-01-24 09:57 · Score: 3, Interesting

This is modded Insightful?

You've completely missed the entire point of the test. This has nothing to do with your next purchase decision -- it's purely designed to test whether or not the common claim that using 64-bit values decreases performance due to memory latency is true. This test makes no claims whatsoever that it has anything to do with whether or not you should be using a 64-bit setup. RTFA.

The "obsolete architecture" is one of the few where 64-bit and 32-bit operations have no inherent performance advantage on the processor, unlike the Opteron and Itanium processors where 64-bit mode has several advantages over 32-bit mode (extra registers or not being emulated). This makes it a perfect testbed for evaluating this claim. The speed of the processor has absolutely no relevance to the question at hand (with the exception of testing memory access starvation on system with a greater CPU to bus clock difference).

It's a shame you're too wrapped up in a "buy, buy, buy" mindset to consider the value of curiosity and of testing commonly held beliefs.

--
If it's for-profit but free, you're not the customer -- you're the product (e.g., the Slashdot Beta's "audience").

Re:What purchase decision? by msobkow · 2004-01-28 04:42 · Score: 1

Please spend some time researching the difference between parallel channel memory controllers, bus architectures, and internal CPU caches before you continue on this line of thinking that the tests are relevant. "Buy" is not the mindset, but "available" certainly is. If you want to test "commonly held beliefs" you don't do so by using a special fringe case that is relevant to less than 5% of the real world market. Perhaps next they can evaluate a Model T and "demonstrate" that it doesn't have the top end and horsepower of a modern sedan, but more wheel torque for use as a tractor.

--
I do not fail; I succeed at finding out what does not work.

Re: OSNews by FyRE666 · 2004-01-24 11:05 · Score: 1

Well I'm not sure if it's any sort of relevant benchmark, but running the OGR benchmark of dnetc on the Ultra10 I used to use at work showed it could crunch gnodes *almost* as quickly as an AMD K6 500. I was surprised at that since it actually felt slower to use (the AMD setup was running Linux, BTW)...

--
Code, Hardware, stuff like that.

Re:*Why* do I have that feeling... by EndlessNameless · 2004-01-24 12:39 · Score: 1

:::It's irrelevant and pointless to spend time discussing the speed differences now between 64-bit and 32-bit.:::

Unless you need to make a decision on which machines to deploy in the near future...

It does matter, just not to most people, and probably not even to the people who are most rabidly touting 64-bit computing **(cough)**Athlon64fanboys**(cough)**.

--

---
According to the latest ruleset, this post should be modded as Vorpal Flamebait +5.

Re:64Bit will be needed when Solid State memory co by Anonymous Coward · 2004-01-24 12:56 · Score: 0

Funny, isn't that the idea behind IBM AS(OS)/400 ?
Not exactly a novel idea.

Re: OSNews by kayen_telva · 2004-01-24 15:29 · Score: 1

oh wait now it's redundant. even though no one else disagreed with him. wtf are the moderators smoking ?

64 versus 32 bits FUD? by stock · 2004-01-24 17:19 · Score: 1

Is this dude sponsored by Intel? Why would someone go through this kind of effort to find out a 64bit app is a tiny bit slower than the 32bit version? Maybe some people were shocked to find Opteron on 64bit was a lot faster than when running in 32bit mode. Why would someone create FUD about 64bit being slower than 32bit when Opteron currently is pulling _all_ bricks out of Intel's backyard??

remember this quote? :

"Windows [n.]
A thirty-two bit extension and GUI shell to a sixteen bit patch to an eight bit operating system originally coded for a four bit microprocessor and sold by a two-bit company that can't stand one bit of competition."
(Anonymous USEnet post)

Here's another one :

"Itanium [n.]
a.k.a. Itanic. An incompatible sixty-four bit extension to a thirty-two bit Pentium 4 CPU created by a company whose previous CPU was called Pentium 5 and presumably also cannot count upwards in performance."

Robert

Re: OSNews by Anonymous Coward · 2004-01-24 23:18 · Score: 0

How do I feel? Like I'm stuck in a bad Christopher Lambert movie.

wow, that's REALLY bad!!

Re: OSNews by Anonymous Coward · 2004-01-25 05:07 · Score: 0

Hey, I've still got my Atari ST, and it's cool. They most certainly do not suck.

Amigas do suck though.

Re: OSNews by scottj · 2004-01-25 12:06 · Score: 1

OpenSSL might be kind of arbitrary

OpenSSL is certainly not arbitrary. It probably stands to benefit most from the 64-bit processor.

--
.-.--

Doing a real benchmark... by aphor · 2004-01-26 03:22 · Score: 1

I disagree with the analysis.

Specifically, I tested my own sparcv9 OpenSSH on sparcv9 OpenSSL compiled with GCC-3.2.1. I got different results, and I have a different conclusion. My needs are to scp 8GB database dumps from one host to several others. I was interested in two factors. First, if run time was significantly different, that would be a compelling factor. That factor accounts for economy in the scarce maintenance window time during which I can safely saturate the network. Second, I was interested in the amount of system resources consumed during the process.

My results were different from the OSNews guy, but then again I am not intimidated by gcc make bootstrap. My sparcv9 OpenSSL libcrypto was carefully configured and compiled to use as much of the sparcv9 assembly code as the OpenSSL project provided. Both of my hosts were Sun Fire V880s with 4x750MHz Ultrasparc-III CPUs running 64 bit Solaris 8. SSH was configured to use the Blowfish symmetric cipher because it is not as CPU expensive as 3DES or AES in software. The disk volumes of the source and sink were EMC Symmetrix volumes over 1Gbps FC SAN using Emulex 900x host adapters. Over several trials, neither 64 nor 32 bit SSH could claim a run time advantage. However, the user-CPU time (number of cpu cycles spent executing SSH code as opposed to time spent waiting for IO service times or kernel/syscalls to complete) for the Sparcv9 SSH was slightly more than half of the same statistic for the Sparcv7 binary.
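
The post does not say how the user-CPU figures were collected. One assumed way to get the same kind of number from inside a C program on a POSIX system is `getrusage()`; the helper names below are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Convert a struct timeval to seconds as a double. */
static double tv_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Store the user-mode and kernel-mode CPU seconds consumed so far by
   this process. Returns 0 on success, -1 if getrusage() fails. */
int cpu_seconds(double *user, double *sys)
{
    struct rusage ru;
    assert(user != NULL && sys != NULL);
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    *user = tv_seconds(ru.ru_utime);  /* time spent running our own code */
    *sys = tv_seconds(ru.ru_stime);   /* time spent in the kernel for us */
    return 0;
}
```

Call `cpu_seconds()` before and after the work and subtract. For a separate program such as ssh, the shell's `time` command, or `getrusage(RUSAGE_CHILDREN, ...)` after waiting for the child, reports the same split between user and system time.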

Admittedly, there is a difference in my benchmark test cases aside from instruction or data word length. Sparcv7 has no integer multiply or integer divide instructions, so it must rely on long-division and repeated adding to achieve the results of those operations. I needed a sparcv7 binary because I still have one old sparcv7 box in service that needs SSH. Because of the cache hit ratio when doing that kind of math, and the fact that Blowfish does not rely heavily on those operations, I doubt that these instructions would account for the difference in CPU cycles required to process this 8GB file.
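
For readers who have not seen multiplication done without a multiply instruction, this is the usual shift-and-add approach in C. It is a generic illustration, not the routine that the sparcv7 runtime actually uses, and the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two 32-bit unsigned integers using only shifts and adds.
   The result wraps modulo 2^32, like a 32-bit hardware multiply. */
uint32_t mul_shift_add(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    while (b != 0) {
        if (b & 1u)      /* lowest bit of b set: add the shifted a */
            result += a;
        a <<= 1;         /* next bit of b is worth twice as much */
        b >>= 1;
    }
    return result;
}

/* Compare against the built-in multiply for a few sample inputs. */
void mul_self_check(void)
{
    const uint32_t samples[][2] = {
        {6u, 7u}, {0u, 5u}, {123456u, 789u}, {0xFFFFFFFFu, 2u}
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        uint32_t x = samples[i][0], y = samples[i][1];
        assert(mul_shift_add(x, y) == (uint32_t)((uint64_t)x * y));
    }
}
```

The loop runs once per significant bit of `b`, so up to 32 iterations for a 32-bit operand. That is why code that multiplies heavily, such as RSA, suffers much more from a missing multiply instruction than code that mostly shifts and XORs.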

Confirming this is difficult because the sparcv8 ABI was a 32 to 64bit crossover architecture in which the OS would interpret and optimize some operations if they were running on a 64 bit Solaris (2.7 or 8) kernel. Thus, a 32 bit sparcv7 binary running on sparcv9 hardware and a sparcv9 kernel may actually be partially executed in 64 bit mode, and be efficient at it. If there are any sparc ABI guru comments to enlighten us on the specifics, those comments would be appreciated. I believe a more rigorous test would verify that the kernel running the 32 bit userland software was a 32 bit kernel.

I conclude that the large data word size allowed SSH to encrypt the 8GB file in nearly 50% fewer CPU cycles, and therefore is faster for this type of operation. Aside from that, note that RSA/DSA operations on the sparcv7 binaries are SIGNIFICANTLY slower, probably due to the multiplication and division required by those algorithms. I'm sorry I wasn't doing this benchmark for publication or I would have saved my data/results.

--
--- Nothing clever here: move along now...

Loaded Question by MichaelKaiserProScri · 2004-01-27 03:33 · Score: 1

It depends on what you need to do. On a 64 bit processor you MUST sling around 64 bits per clock cycle. If you need to, like in encryption applications or in a well written database engine, then it will be faster. If you only need 32 bits, then you are slinging around 32 bits of junk on every tick and will see no benefit. Simply recompiling an application which uses 32 bit aligned structures as 64 bit will cause a performance decrease. But redesigning those structures to be 64 bit aligned will cause a performance increase. Another point of confusion is that most 64 bit processors introduce some form of VLIW (Very Long Instruction Word). VLIW lets the compiler, rather than the processor, sort out the dependencies between instructions and decide how to load the pipelines. The effectiveness of this is compiler dependent and will improve as the compiler technology grows to fit the new processors. VLIW is not inherently a feature of 64 bit processors, but the 64 bit processors are better able to handle the long instructions.
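
The point about structure alignment can be checked with a few lines of C. The two structs below are hypothetical, chosen only to show how member order interacts with alignment padding. The exact sizes depend on the ABI, so treat the printed numbers as something to measure rather than assume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Same three members in two different orders. */
struct loose { char tag; int64_t big; char flag; };  /* padding around big */
struct tight { int64_t big; char tag; char flag; };  /* widest member first */

/* Print the size of each layout and where the 64-bit member lands. */
void print_layouts(void)
{
    /* Putting the widest member first never makes this struct larger. */
    assert(sizeof(struct tight) <= sizeof(struct loose));

    printf("loose: size %zu, big at offset %zu\n",
           sizeof(struct loose), offsetof(struct loose, big));
    printf("tight: size %zu, big at offset %zu\n",
           sizeof(struct tight), offsetof(struct tight, big));
}
```

On x86-64 Linux, where `int64_t` is 8-byte aligned, `struct loose` comes out at 24 bytes and `struct tight` at 16. In a large array that difference is memory and cache space spent on padding.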

Slashdot Mirror

444 comments