I don't know if you can purchase the quad cpu board, use one cpu and all sixteen slots or not, but I think it will work.
No, it won't. The memory controller is in the processor. If you don't populate a processor socket, you can't use the DIMM banks attached to that socket.
The article cites disk caches as a source of data-loss.
They claim that their battery-backed RAID caches were safe, but that the actual drives themselves were performing unsafe write caching. It strikes me that this is the kind of thing that's quite easy to *suggest*, but far more difficult to *prove*.
I don't have any first-hand knowledge of disk corruption due to write-caching. Is this a real problem or just some kind of legend? Can someone who has RTFA'ed and knows about disk caches please comment?
This is somewhat irrelevant, but I've messed with some non-battery-backed RAID setups in the past. In these situations, it always made sense to me that the controller would set the individual drives' cache policy to match its own.
It does have a 64bit math library however, which exposes the most important functions of the 64bitness of the G5.
This is a common misconception. The most important function of any 64-bit processor is the ability to address more than 4GB (2^32 bytes) of memory. Everything else is icing on the cake.
Case in point: it's not like 64-bit math is impossible without 64-bit registers. I just opened the calculator app on my 32-bit machine and did 2*(2^32), no problem.
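To make that concrete, here's a rough sketch of how 64-bit addition can be built out of 32-bit pieces: add the low halves, then carry into the high halves. The helper names (add64, split, join) are made up for illustration, and Python integers stand in for 32-bit registers, so treat it as a model of the idea rather than what any particular calculator app does.

```python
MASK32 = 0xFFFFFFFF  # the bits that fit in one 32-bit register

def split(x):
    """Split a 64-bit value into (low, high) 32-bit halves."""
    return x & MASK32, (x >> 32) & MASK32

def join(lo, hi):
    """Recombine (low, high) 32-bit halves into one 64-bit value."""
    return (hi << 32) | lo

def add64(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values using only 32-bit-sized pieces plus a carry."""
    lo = a_lo + b_lo
    carry = lo >> 32                      # 1 if the low halves overflowed
    hi = (a_hi + b_hi + carry) & MASK32   # wrap like a real 32-bit register
    return lo & MASK32, hi

# 2 * 2^32 computed as 2^32 + 2^32, one half at a time
lo, hi = add64(*split(2**32), *split(2**32))
result = join(lo, hi)
print(result)  # 8589934592
```

A real 32-bit CPU does the same thing with an add followed by an add-with-carry, which is why the wider registers are a convenience and not a requirement.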
maxim: cycles are cheap, people are expensive. For the *vast majority* of software it is significantly better value to design and build a well-architected OO solution than to optimise for performance in languages and methodologies that are more difficult to implement and maintain. Who cares if it's not very efficient - it'll run twice as fast in 18 months, and will be a lot cheaper to change when the client figures out what they actually wanted in the first place. But I guess you already knew that.
I couldn't possibly agree more with this. Hits the nail right on the head.
...I consider that as one of the better arguments against OO code - It simply does not map well to real-world CPUs...
This, on the other hand, couldn't be more incorrect. I'm not sure I'll ever understand the motivation of people who post this crud. How can anyone know so little, and not know that they know so little?
Object-oriented code is just as fast as anything else. If you don't believe me, listen to Stroustrup: "Learning Standard C++ as a New Language".
I'm a software engineer and I work for a major CPU manufacturer. As you might guess, my job involves a ton of assembly programming.
That being said, I disagree.
You can learn CS concepts in many ways. It's cute to learn from the bottom up, but it's impractical. I oppose it for the same reason I oppose CS curricula based on underdog languages (like Eiffel, to name one I was taught). I don't care how 'clean' they are, teach something useful.
If runtime efficiency matters, you'll know about it and eventually get down to the assembly level. If it doesn't matter, for the love of god please "optimize" for something that does: like readability, maintainability, extensibility, portability, modularity, testability, etc.
Crappy fast software is still crappy software.
What we're really dealing with here is complexity, and how to manage it. Software engineers design complex programs for complex hardware. You can't possibly know every detail. Thanks to the magic of abstraction, you don't have to.
This is the equivalent of the Intel SSE/MMX instructions but the MVI instructions had direct access to memory as they were 64 bits wide just like any other instruction on the chip.
This sentence doesn't make a whole lot of sense...
Data size != address size. Case in point: SSE and SSE2 operate on 128-bit data. They work on 32-bit P4s as well as 64-bit Athlon64s. The address tells you where to go to find the data. How much data you get once you're there is another story.
The issue of "direct access to memory" really has nothing to do with data width. x86 has memory-indirect addressing modes by virtue of its CISC-ness. This applies to MMX, SSE, SSE2, and even 3DNow. In this sense, they have direct access to memory. I'm not familiar with Alpha assembly language, but I'd bet $10 that they don't have these addressing modes (RISC architectures generally don't, by design).
...90% of hardware improvements are essentially wasted by programmer inefficiency.
While this may be true, it's largely done on purpose.
Professional programmers are in the business of making tradeoffs: time versus space, speed of execution versus speed of development, etc.
While it's true that a crack team of assembly programmers could probably rewrite the whole of MS Office for optimum performance, chances are:
1) It would take them years.
2) Users would hardly notice a difference ("Wow, the about box comes up in 100 ms instead of 500!")
3) The code would be impossible to maintain.
Nowadays, professional programmers who are working on performance-critical software tend to write first and optimize second (after they profile the code to determine where 'hotspots' are).
Just look at 'write-once-run-anywhere' languages like Java or .NET. Bytecode/virtual machines eliminate the need to port our application 50 times, but in trade we give up a whole bunch of speed. If speed doesn't matter, it's all upside.
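For anyone who hasn't done the "write first, profile, then optimize" routine, here's a minimal sketch using Python's standard cProfile and pstats modules. The functions (slow_sum, fast_path, workload) are toy stand-ins I made up; the point is only that the profiler tells you where the time goes before you touch anything.

```python
import cProfile
import io
import pstats

def slow_sum(n):
    """Deliberately heavy loop: this should show up as the hotspot."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def fast_path(n):
    """Cheap helper that is called often but costs almost nothing."""
    return n + 1

def workload():
    slow_sum(200_000)
    for _ in range(100):
        fast_path(1)

# Profile the whole workload first, before deciding what to optimize.
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Sort by cumulative time so the most expensive call paths come first.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

In a report like this, slow_sum should account for nearly all of the time, so that's the only function worth rewriting. Hand-tuning fast_path would be wasted effort no matter how many times it's called.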
Excuse me if this is a stupid question. I couldn't find the answer anywhere...
How does Lexmark know that Static Control made its interoperable chip through legal means? Static Control could have just cracked the chip open, stuck it under a microscope and ripped off the entire design.
Obviously, clean-room reverse engineering is legal. There's plenty of precedent to that effect. Even the DMCA has exemptions for it.
Perhaps Lexmark has some reason to believe Static Control illegally copied their chip?
The announced Opteron parts do not have dual DDR memory channels
I don't know where this crap is coming from. Certainly not from the document referenced in the parent post.
A DIMM is 64 bits wide. The Opteron has a 128-bit-wide memory bus, which means you need to use pairs of DIMMs, much like the older P4s with Rambus memory.
There are plenty of pins for this in the 940 package. The block diagram on page 11 of the data sheet even shows the 128 MEMDATA pins.
The memory controller is configurable to support a 64-bit memory bus (probably for desktop or mobile versions of the part), but in all the systems I've used you can't even boot with an odd number of DIMMs.
Now you can decide for yourself if a 128-bit-wide DDR bus is "dual channel" or not. I'm not going to argue semantics. I am, however, going to do the math and tell you that the Opteron paired with DDR333 provides 128*333/8 = 5328 MB/s of some seriously low-latency bandwidth. Oh yeah, and it scales with the number of processors too.
DISCLAIMER: I work at AMD but I am not speaking on behalf of the company.
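If you want to check the arithmetic yourself, here's a quick sketch. The function name and parameters are mine, and it assumes the nominal 333 million transfers per second implied by "DDR333". It only computes theoretical peak numbers, nothing about real-world latency or efficiency.

```python
def peak_bandwidth_mb_s(bus_width_bits, mega_transfers_per_s):
    """Theoretical peak in MB/s: bytes per transfer times millions of transfers per second."""
    bytes_per_transfer = bus_width_bits // 8
    return bytes_per_transfer * mega_transfers_per_s

# One 128-bit bus versus two independent 64-bit channels, both at DDR333.
single_128 = peak_bandwidth_mb_s(128, 333)
dual_64 = 2 * peak_bandwidth_mb_s(64, 333)

print(single_128, dual_64)  # 5328 5328
```

Either way you slice it, the peak figure comes out the same, so whether you call it "dual channel" makes no difference to the headline number.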
Orcas supports _InterlockedCompareExchange128 (CMPXCHG16B). It should be in the Beta: http://msdn2.microsoft.com/en-us/vstudio/aa700831.aspx
Out of This World (aka "Another World" in Europe) was truly an amazing game. I'm somewhat of an OOTW historian. Here's some interesting stuff for those who care:
The original game was released in a PC version which happens to work quite well in DOSEmu.
There is a sequel, called "Heart of the Alien", that was only available on Sega CD. MobyGames link: http://tinyurl.com/49s8w
Gens is an emulator for Windows and Linux that can play the ROM, should you be able to find it: http://gens.consolemul.com/downloads.shtml
It turns out that the OOTW engine is actually an interpreter that "plays" the game data files. One of the authors of the popular ScummVM software wrote a new interpreter called "RAW" (for "Rewritten Another World") but subsequently took it down under pressure from the original author. It's too bad, because it worked quite well (at least on the European "Another World" data files) on both Windows and Linux.
I've recently heard it referred to as "RAS Syndrome", or Redundant Acronym Syndrome Syndrome.
WOW64, if you're not familiar with the acronym, means Windows on Windows 64. It's basically their "emulator" (it's more of an interpreter) to run code not compiled for 64 bit. Instead of going the FreeBSD route and allowing for both 32 and 64 bit programs to run at the same time (props for FreeBSD), Microsoft decided to go with an emulator - which happens to suck horribly, and freeze a lot.
Lies.
Windows and FreeBSD both do exactly the same thing, which is to let 32-bit programs run at full speed, natively, on the CPU. Practically the whole point of the AMD64 architecture is backwards compatibility. The world didn't need another Itanium.
WOW64 Implementation Details
From the Top500 List for November 2003:
Earth Simulator - 5120
LANL / ASCI Q - 8192
LLNL / ASCI White - 8192
NERSC / LBNL / Seaborg - 6656
Nice research, BBC.
No, this does not signal that Itanium is doomed.
I have to agree. What really signals that Itanium is doomed is the fact that no one is buying it.
But you gotta dig the irony: Intel is making an AMD-compatible processor.
One seriously should not underestimate the significance of binary compatibility. Nowadays the external ISA is a silly detail anyway. Any processor worth the silicon it was made on has a RISC microarchitecture.
Who mods this crap up?
No matter how hard I "imagine", I still can't write my master's thesis on a four-hour plane ride without my battery crapping out.
-turm
NetQuake forever: http://www.planetquake.com/proquake/
Zoid's only mistake was attempting to match the physics of a "listen" server, not the "dedicated" servers that all the pros played.
I could dive into the subtleties of sys_ticrate settings, etc., but suffice it to say that no self-respecting pro would ever play on a "listen" server because the physics got all goofed up.
QuakeWorld was a nice attempt at making things easier for HPBs, but it never took hold in the pro scene. All the real tourneys were, and still are, LAN events anyway.
There's only one channel, which happens to have twice the data width.
OK, I think we've finally gotten to the bottom of this. Hardware purists may want to refer to Opteron's memory controller as "a single 128-bit wide DDR channel" rather than "dual channel DDR" because technically it's not.
I'm not well-informed as to the performance differences between 2x64b and 1x128b DDR buses. In theory they have the same peak bandwidth.
Thanks for the spirited debate, Eric.