Linux has been running on Alpha for quite a while
on Is SMT In Your Future? · Score: 1
In fact, folklore has it that Alpha was the first non-x86 architecture that Linux was ported to. I'm not sure if this is true or not, but I do know that Linux SCREAMS on Alpha...
Of course, you probably meant the EV8 in particular, not the Alpha in general--to which I reply that it should be able to run the same Linux that runs on EV6s, although there might be some changes to fully take advantage of SMT.
There are two competing goals for next generation processors.
One is to create the absolute fastest processor for running a single thread--and this requires many-way issue and lots of functional units to capture all the instruction-level parallelism. The other is to maximize the throughput of all the threads running on the system (i.e. capture all thread-level parallelism).
The first goal would lead you to create a high clock rate, highly out-of-order speculative CPU. The latter goal would lead you to do chip multi-processing with smaller, simpler cores. The compromise is to do SMT.
Once you have a powerful SMT processor, there is nothing to prevent you from putting several of these on a die in future generations (i.e. SMT and CMP are not mutually exclusive).
I'm so frikin sick of hearing about how the Transmeta CPU can morph into any architecture on the planet. This is just patently false. Even if you were just joking, there are many out there that believe this crap.
A dual channel DDR has 4x the number of data pins and traces as a dual channel Rambus. Why is everyone so unwilling to compare apples with apples? Try 4-8 channels of RDRAM vs. 2 channels of DDR, and see how your math comes out. God, this forum is ridiculous.
>>>1. No matter HOW they try to spin it with Perot-Esque charts, current RAMBUS designs will never have the potential speed of DDR, and probably never will. Pumping data really really fast 16-bits at a time just simply can't compete with pumping data not quite as fast (yet) 64-bits at a time. RAMBUS is like figuring out a way to run a `286 really really REALLY insanely fast, but it's still only 16-bits.
That's ridiculous. With 16 data pins, PC800 RDRAM produces (800MHz*2bytes=) 1.6GB/s of bandwidth. At 133MHz, with 64 data pins, SDRAM produces (133MHz*8bytes=) 1.064GB/s of bandwidth. Now, if you give both the same number of data pins (64), then you have 4 channels of RDRAM for 6.4GB/s compared to one bus of SDRAM for 1.064GB/s. If you look at all the pins necessary to make the interfaces work, and give the same number of pins to both, then you are still looking at a 3:1 ratio of bandwidth for RDRAM vs. SDRAM.
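For anyone who wants to check the data-pin arithmetic above, here is a rough sketch. It only models peak bandwidth as transfer rate times bus width; the function name and the idea of counting data pins alone (no address, control, or clock pins) are my simplifying assumptions, so it does not reproduce the 3:1 all-pins figure.

```python
def peak_bandwidth_mb_s(transfer_rate_mhz, data_pins, channels=1):
    """Peak bandwidth in MB/s: transfers per second x bytes per transfer x channels."""
    bytes_per_transfer = data_pins // 8
    return transfer_rate_mhz * bytes_per_transfer * channels


# Figures quoted in the post: PC800 RDRAM (16 data pins) and PC133 SDRAM (64 data pins).
rdram_one_channel = peak_bandwidth_mb_s(800, 16)
sdram_one_bus = peak_bandwidth_mb_s(133, 64)

# Same number of data pins (64): four RDRAM channels against one SDRAM bus.
rdram_four_channels = peak_bandwidth_mb_s(800, 16, channels=4)

print("RDRAM, 1 channel :", rdram_one_channel, "MB/s")
print("SDRAM, 1 bus     :", sdram_one_bus, "MB/s")
print("RDRAM, 4 channels:", rdram_four_channels, "MB/s")
print("ratio at 64 data pins:", round(rdram_four_channels / sdram_one_bus, 2))
```

These are peak numbers only; sustained bandwidth depends on the controller and the access pattern.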
>>>Because of these limitations, RAMBUS has been proven time and again to be slower than even CURRENT PC-133 SDRAM. Even Intel unwittingly published proof of this. DDR vs RDRAM shouldn't even be a contest.
Where has this been proven 'time and time again'? All you've seen is Intel's butchering of Rambus's protocol. They used two channels of RDRAM, but they piped both of them through a front side bus that had lower bandwidth than a single Rambus channel. Please don't measure the technology based on Intel's crappy implementation. Maybe their Willamette stuff is better, but I haven't seen the numbers yet. When we see the Spec2k numbers for a system with a decent RDRAM implementation vs. one with SDRAM, then we'll have our answers.
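The bottleneck argument can be put as a tiny sketch: whatever the memory can supply, the processor only sees the smaller of the memory bandwidth and the front side bus bandwidth. The front side bus figure below (133MHz, 8 bytes wide) is my assumption for illustration, not a number given in the post, and the function name is made up.

```python
def delivered_bandwidth_mb_s(memory_mb_s, fsb_mb_s):
    """The processor sees whichever link is slower."""
    return min(memory_mb_s, fsb_mb_s)


RDRAM_CHANNEL_MB_S = 1600   # PC800 channel, as quoted in the post
ASSUMED_FSB_MB_S = 133 * 8  # assumption: 133MHz x 8 bytes front side bus

for channels in (1, 2, 4):
    memory = channels * RDRAM_CHANNEL_MB_S
    seen = delivered_bandwidth_mb_s(memory, ASSUMED_FSB_MB_S)
    print(f"{channels} channel(s): memory {memory} MB/s, processor sees {seen} MB/s")
```

Under that assumption, adding channels behind the same bus changes nothing the processor can measure.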
>>>2. RAMBUS is and always will be more expensive to produce than DDR or other types of SDRAM. This is why RAMBUS is trying to slap the "RAMBUS tax" on all competing memory, to artificially raise prices, and gouge the customer.
Why is RDRAM always going to be more expensive than SDRAM or DDR? It's the exact same fabrication technology. Rambus needs a little extra die area, which accounts for the expense difference. As the memories get larger, this die space will become a smaller percentage of the overall die. The only thing that is really more expensive to make (once the market matures) is the hardware, traces, and connectors needed to operate at 800MHz+ data speeds--not the RAM itself.
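To illustrate the die-area point, here is a toy calculation. Every number in it is hypothetical (I don't have real die sizes), and it assumes the interface logic stays a fixed size while the memory array grows.

```python
def interface_overhead_fraction(array_area_mm2, interface_area_mm2):
    """Fraction of the whole die taken up by the interface logic."""
    return interface_area_mm2 / (array_area_mm2 + interface_area_mm2)


HYPOTHETICAL_INTERFACE_MM2 = 10.0  # assumption: fixed-size interface block

# Hypothetical memory array areas for successively larger parts.
for array_area in (50.0, 100.0, 200.0, 400.0):
    fraction = interface_overhead_fraction(array_area, HYPOTHETICAL_INTERFACE_MM2)
    print(f"array {array_area:5.0f} mm^2 -> interface overhead {fraction:.1%}")
```

If the fixed-size assumption holds, the overhead fraction falls steadily as the array grows.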
As for the rest of your stuff--most of it is probably right. I don't know how good Rambus's patent claims are--clearly they aren't frivolous, but I still think it is a crummy way to do business. I'm not going to defend their business practices, because I think they are pretty low--but the technology is sound.
They do some neat stuff. It's no Alpha or anything, but I wouldn't call it the worst-designed Intel chip ever (that is a very difficult standard to meet). Yes, their marketers had a field day naming everything in the cache as if Intel had invented ANY of it--but it will probably run okay when it is running at 2 GHz or so.
Ah, but you've got to take into account the width of the bus. The wider the bus, the harder it is to build all those big thick traces and the harder it is to do all that impedance matching, etc. That's kinda the point with Rambus--it is easier to build a thinner, serial-style 8-16 bit interface that runs at high frequencies than it is to build a wide, 64-256 bit bus that runs at medium-high frequencies.
My proof is that there are 800MHz (1066MHz in the lab) Rambus parts, right now! Available to be bought. DDR SDRAM is at, what?, 133MHz? with 200MHz someday?
Yes, it will be a little easier to build a motherboard for 200MHz DDR SDRAM than for 1000MHz Rambus signals, but clearly it isn't THAT much easier, because I don't see the motherboards or the memory out there yet. And keep in mind that even at 200MHz, DDR SDRAM will have much less bandwidth per pin than a 1000MHz Rambus setup.
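A quick sketch of the per-pin comparison, counting data pins only. Since "200MHz DDR" could mean either 200 million transfers per second or a 200MHz clock with two transfers per cycle, the sketch tries both readings; that ambiguity handling and the function name are mine.

```python
def bandwidth_per_data_pin_mb_s(transfer_rate_mt_s, data_pins):
    """Peak MB/s carried by each data pin (address and control pins ignored)."""
    total_mb_s = transfer_rate_mt_s * data_pins / 8
    return total_mb_s / data_pins


rambus = bandwidth_per_data_pin_mb_s(1000, 16)         # 1000 MT/s, 16 data pins
ddr_200_transfers = bandwidth_per_data_pin_mb_s(200, 64)  # reading 1: 200 MT/s
ddr_200_clock = bandwidth_per_data_pin_mb_s(400, 64)      # reading 2: 200MHz clock, 400 MT/s

print("Rambus at 1000 MT/s:", rambus, "MB/s per data pin")
print("DDR at 200 MT/s    :", ddr_200_transfers, "MB/s per data pin")
print("DDR at 400 MT/s    :", ddr_200_clock, "MB/s per data pin")
```

Under either reading the DDR figure per data pin comes out lower than the Rambus one.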
> "The [original Intel 820
> chipset] issues were not defects within the MTH.
> The issues were with the Rambus channel itself
> and the use of large packages at channel speeds.
> Technically, the problem has been with
> microwave-like resonance effects in the component
> packages, connectors
> and in the structures formed by these when
> placed on printed circuit boards."
All that says is that Intel was having a hard time designing for the low-impedance, high frequency environment necessary to make Rambus work. When DDR SDRAM grows up to higher clock frequencies (such that it can actually have bandwidth per pin comparable to RDRAM) it will have the EXACT same problems. This just means that the cost-sensitive PC world isn't ready for Rambus signalling, not that the technology itself is flawed.
> Also, here is the Tom's Hardware Guide article:
> The Rambus Zombie Versus the Wounded Chipzilla.
> Also, the benchmark ; which shows the lower
> performance figures under Rambus.
All that stupid document was looking at was Intel's pathetic RDRAM implementation, which used only two RDRAM channels. The front side bus is actually the bandwidth limiter on that implementation, not the memory, so of course you aren't going to see Rambus in all its glory. I would suggest that you wait until Alpha's EV7 comes out before you pass judgement on RDRAM.
RDRAM is an innovative technology--it is just too expensive and difficult to design for the mass PC market. I defy you to tell me what is the flaw in Rambus technology.
Now Rambus as a company, sure I don't like them any more than the next guy--but that is just because I disagree with their IP/litigation business model.
...then use your SDRAM and be happy. If you want real performance, then you have to acknowledge the possibility that there are other computers besides Intel or AMD boxes. If you would put down Quake for long enough to learn how Rambus works and why it is superior for high-end machines, then you would know why it was chosen by companies like Alpha for their high-performance processors. Your little toy benchmarks don't matter to the part of the world that is buying Rambus right now.
I'm not saying that PC'ers should buy RDRAM (yet), but don't trash a technology that you don't understand.
Rambus is able to run the bus much faster because they use a serial protocol, along with an interesting clocking scheme. They have lower pin counts than a parallel bus with equivalent bandwidth, but then you can have multiple Rambus channels for an equivalent pin count and much higher bandwidth. Yes, there are significant PCB problems for such a tight bus, which is why the PC world is probably not ready for Rambus yet.
Alpha has been shrinking EV6s, so I guess power has gone down--but clock speeds are about to go up big time, so power will go up, as well. I'm sure EV7 will be quite power-hungry. Alphas do seem to have better power-scalability than do Intel processors, not really sure why. Regardless, Alpha engineers are thoroughly uninterested in saving power, so I wouldn't expect the current trend of lower power Alphas (if there is one) to continue.
the limitations are not that big a deal. Yes, RDRAM will always be worse than SDRAM latency-wise, but not so much so that performance is compromised. If it were, then Alpha wouldn't do it because they aren't beholden to any RAM standard, and they have no designs on 'controlling' the memory industry, which people claim is Intel's reason for backing Rambus.
in a distributed shared memory machine--where cache coherence is maintained with directories and dedicated inter-processor links, not with a snooping bus. There's probably a way to make it work in a snooping bus SMP system, but it would be difficult. Very good question, Ben.
Yes, RDRAM transmits 16 bits at a time, but if you take 4 RDRAM channels, then you have 64 bits at a time, with 4x1.6GB/s=6.4GB/s of bandwidth, all with the same pin count as a single 64 bit DDR SDRAM bus (which will supply only around 2.1 GB/sec). Therefore, for the same pin count, RDRAM will give you 3 times as much bandwidth.
As for latency, Intel did a poor implementation in which the memory controller was in the chip-set, and communicated with the processor over a standard bus. A better idea is to do what Alpha is doing, and place the RDRAM controller on-chip, which reduces the latency significantly.
Asymptotically, there is no comparison. Processors will get more and more bandwidth hungry, and RDRAM will always supply much more bandwidth than SDRAM for an equivalent pin count (because of the pipelined--optimized data path with synced clock). Latency will be corrected by better implementations (although it will never rival SDRAM). Latency to memory is not quite as important as everyone here would have you believe, and it will become less so as caches get larger and better.
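The point about caches can be sketched with the textbook average memory access time formula (hit time plus miss rate times memory latency). The latencies and miss rates below are hypothetical, picked only to show the trend, and are not measurements of any real SDRAM or RDRAM system.

```python
def average_access_time_ns(hit_time_ns, miss_rate, memory_latency_ns):
    """Textbook average memory access time: hit time + miss rate x miss penalty."""
    return hit_time_ns + miss_rate * memory_latency_ns


HIT_TIME_NS = 2.0         # hypothetical cache hit time
SDRAM_LATENCY_NS = 100.0  # hypothetical
RDRAM_LATENCY_NS = 120.0  # hypothetical, somewhat worse than SDRAM

gaps = []
for miss_rate in (0.10, 0.05, 0.01):  # better caches -> lower miss rate
    sdram = average_access_time_ns(HIT_TIME_NS, miss_rate, SDRAM_LATENCY_NS)
    rdram = average_access_time_ns(HIT_TIME_NS, miss_rate, RDRAM_LATENCY_NS)
    gaps.append(rdram - sdram)
    print(f"miss rate {miss_rate:.0%}: SDRAM {sdram:.1f} ns, "
          f"RDRAM {rdram:.1f} ns, gap {rdram - sdram:.1f} ns")
```

With these made-up numbers the latency penalty seen by the processor shrinks in proportion to the miss rate, which is the trend the paragraph is describing.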
Because I'm not aware of it. Alpha EV7, due out next year, will have a low-latency, very high bandwidth (12.8GB/sec) RDRAM implementation. The engineers designing that processor don't seem to be aware of the fundamental limitation that you claim.
http://www.alphalinux.org/
I would still run a dual EV6 against a dual Athlon any day.
comparable to (682*4) Alphas? Why would you want to use Intel processors in a supercomputer?
...I think they use Compaq Tandem systems. When there's that much money at stake, you don't rely on Microsoft/Intel to make your software/processors.
Just underclock it to 750MHz and you can use your same old heat sink, power supply, and case. Would that make you happy?
Do you really think that engineers enjoy putting huge heat sinks on processors for no reason at all?
Did it ever occur to you that 1.4GHz processors with huge die sizes might be more difficult to cool?
But it's really nothing more than a random sentence generator that selects words like 'beowolf', 'open-source', 'gpl', 'm$', 'ipaq', 'linux', etc.
Still, the submissions get accepted about 35% of the time.
if they thought it would help their bottom-line. The engineers there are great, but the company is run by managers and lawyers.
a deal.
Compaq sells 12,000-processor EV68 machine to DOD
You can easily saturate a RDRAM bus with a single processor if you are streaming data in (which is common in media happy applications).