There is no reason to get a 32-bit processor over a 64-bit one either though. As long as the price is the same, why worry?
Seriously, people talk about not wanting 64-bit chips like there's some downside to it. Switching the hardware to 64 bits is essentially free. The cost in silicon is already minimal for a desktop processor with today's (130nm) production capabilities, and with 90nm production it becomes negligible.
So, the easiest answer to the question of "Why 64-bit?" would be: Why not?
First off, the definition of a bit-width has changed somewhat from the days of 4, 8 and even 16-bit chips. Back then you were normally talking about the size of your registers and the width of your data bus. Now we're mainly talking about the number of bits you can directly address in a flat memory space (ie the size of your pointers). So the change hasn't been quite as dramatic as you suggest (though it certainly has been dramatic).
However, with 64-bit chips, we can now directly address a LOT of memory: 2^64 bytes, or roughly 1.8 x 10^19 bytes' worth. We're really going to have to make some fairly major changes to our current microprocessor paradigm before we get to that point. Processors keep getting exponentially faster, but most other components in the system are only improving in speed linearly (or, at best, at a much slower exponential rate). If we keep this up, by the time we're dealing with 10^19 bytes' worth of data, the speed of I/O systems will be so out of whack with the speed of the processors that we won't get anything done.
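For a sense of scale, here is a quick sketch of how big a flat address space gets at each pointer width. It assumes byte-addressable memory (one address per byte); the function name is just made up for illustration.

```python
def addressable_bytes(pointer_bits: int) -> int:
    """Size of a flat address space, assuming one address per byte."""
    return 2 ** pointer_bits


for bits in (16, 32, 64):
    size = addressable_bytes(bits)
    # 2**30 bytes = 1 GiB
    print(f"{bits}-bit pointers: {size} bytes ({size / 2**30:.6g} GiB)")
```

The jump from 32 to 64 bits squares the size of the address space, which is why nobody expects to run out any time soon.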
There are some benchmarks available at the Spec website. You have to do a little bit of digging, but AMD has submitted a lot of results with identical systems running 32-bit and 64-bit Linux.
On average, running AMD64 code will buy you about a 5% boost in performance over running IA-32 code on the Opteron/Athlon64. Of course, some applications will see a much larger improvement (some apps are more than twice as fast), while others will actually be slower (64-bit pointers = more data to load from memory and fewer pointers that can be stored in cache). Note that in most cases the speedup has just about nothing to do with 32 vs. 64-bit, but rather with the fact that AMD64 doubled the number of general purpose registers (16 vs. 8 in IA-32). Considering that the lack of registers was one of the last major downsides to the x86 instruction set (as compared to something like PowerPC, MIPS, SPARC, etc.), this was a very good idea. Of course, even 16 visible registers is rather low; most other ISAs have 32+ visible registers.
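To put a number on the pointer-size penalty, here is a rough sketch of how many pointers fit in a given amount of cache. The 64-byte line and 64KB data cache are assumed values picked for illustration, not figures from any datasheet.

```python
CACHE_LINE_BYTES = 64       # assumed cache line size
L1_DATA_BYTES = 64 * 1024   # assumed L1 data cache size


def pointers_that_fit(capacity_bytes: int, pointer_bytes: int) -> int:
    """How many pointers of the given size fit in capacity_bytes."""
    return capacity_bytes // pointer_bytes


for name, pointer_bytes in (("32-bit", 4), ("64-bit", 8)):
    per_line = pointers_that_fit(CACHE_LINE_BYTES, pointer_bytes)
    per_cache = pointers_that_fit(L1_DATA_BYTES, pointer_bytes)
    print(f"{name} pointers: {per_line} per line, {per_cache} per L1 data cache")
```

Pointer-heavy code (linked lists, trees) holds half as many pointers in the same cache when built as 64-bit, which is where the slowdowns come from.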
The latest and greatest Itanium2 (running at 1.5GHz and with 6MB of L3 cache) is sometimes faster than AMD's Opteron and sometimes slower when both are running natively compiled code. In general they are pretty close. This is actually quite a coup for AMD, since the Opteron is a much smaller and cheaper chip. The fact that they're able to throw down with the big boys (Intel's Itanium2 and IBM's Power4+) is pretty darn impressive.
As for Windows XP, right now the only 64-bit version is for Intel's Itanium instruction set (IA-64). Microsoft does plan on producing a version of WinXP for AMD's 64-bit instruction set (AMD64, aka x86-64), but it has been delayed multiple times (typical Microsoft) and now isn't expected until mid-2004.
The larger addressable memory space is the primary advantage to using 64-bit chips. You also get 64-bit integer registers, generally speaking. Addressable memory size and integer register size aren't linked on every architecture, though if a chip has only one of the two I wouldn't really call it a 64-bit chip. For most applications, 64-bit registers don't get you anything, but there are a few situations where they do help.
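One of the situations where wider registers help is plain 64-bit integer math. A minimal sketch of what a 32-bit chip has to do for a single 64-bit add follows: two adds plus carry handling, where a 64-bit chip needs one instruction. The function name is hypothetical and Python integers are only standing in for registers here.

```python
MASK32 = 0xFFFFFFFF


def add64_with_32bit_registers(a: int, b: int) -> int:
    """Add two unsigned 64-bit values using only 32-bit-wide pieces."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    lo = a_lo + b_lo           # first 32-bit add
    carry = lo >> 32           # carry out of the low half
    lo &= MASK32

    hi = (a_hi + b_hi + carry) & MASK32   # second add, with carry in
    return (hi << 32) | lo     # result wraps at 2**64, like hardware


print(hex(add64_with_32bit_registers(0xFFFFFFFF, 1)))
```

Multiplies and divides get worse still, which is why things like crypto and large-file arithmetic are the usual examples of code that likes 64-bit registers.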
As for 64-bit on the desktop, I think that now is the time to start the move. Installed memory on desktops tends to double every 18-24 months. Right now 1GB is the norm for a high-end desktop. That means that by mid to late 2005, 2GB will be the norm, and that's pretty much the max for 32-bit chips without resorting to all kinds of ugliness. Sure, you can still address 4GB of memory with a 32-bit chip, but that's your virtual address space. Having more than 2GB of memory on a 32-bit chip means that you have less virtual memory than physical memory, which is fine for a system like Linux, but not so good for BSD or Windows (due to differences in how they use virtual memory). You also start running into some issues of memory fragmentation. In short, a 64-bit processor becomes a real advantage any time you have 2GB or more of memory, which, as mentioned above, will likely become commonplace for new systems sometime in mid to late 2005.
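The timeline is just doubling arithmetic. A back-of-the-envelope sketch, taking the 1GB starting point and the 18-24 month doubling period from the paragraph above as given (the function name is made up):

```python
import math


def months_until(start_gb: float, target_gb: float, doubling_months: float) -> float:
    """Months for installed memory to grow from start_gb to target_gb."""
    return doubling_months * math.log2(target_gb / start_gb)


for doubling_months in (18, 24):
    to_2gb = months_until(1, 2, doubling_months)
    to_4gb = months_until(1, 4, doubling_months)
    print(f"doubling every {doubling_months} months: "
          f"2GB in {to_2gb:.0f} months, 4GB in {to_4gb:.0f} months")
```

One doubling gets a high-end desktop to the 2GB mark, and a second doubling runs into the full 4GB of a 32-bit address space.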
So why move to 64-bit now if you don't need it for two years? Simple: software is much slower (and more expensive) to move to new architectures than hardware. AMD's 64-bit chips have been out for 6 months already, and they were at least 6 months late before that, yet we've only just recently started seeing the first versions of AMD64 Linux. WinXP for AMD64 still isn't here (won't be for 6+ months), and applications will take time after that. If you wait until the last minute to start shipping 64-bit chips, you won't have an operating system to use them with.
As for the Jurassic Park thing, even if I had the most powerful render farm in the world with nearly infinite resources, I still don't think that I would be making Jurassic Park. That sort of thing takes a bit more than just processing power!:>
It is a Cyrix derivative, though a few generations removed. Cyrix got bought out by National Semi a handful of years back ('98 if my memory is working correctly). After two years of mismanagement, they were mostly sold off to VIA, all except for the Geode division. NS kept on developing the Geode and selling it into set-top boxes with little to no success.
Then a couple months ago, they sold that division off to AMD. I suspect that the transfer of people, technology and whatnot is still a work-in-progress, but practically speaking, the Geode is now an AMD product.
To the best of my knowledge, AMD plans on doing a fair bit of overhauling of the design while keeping the basic philosophy (i.e. a low-power, low-cost, heavily integrated x86 chip).
Sure there is a market for 8-bit processors now, but will there be in 10 years? 50 years? There was a market for vacuum tube-based computers after transistors made them totally obsolete, but you sure as hell don't find anyone making vacuum tube-based computers anymore (actually, knowing the Slashdot community, there probably is some weirdo out there doing just that, but I digress :> ).
The market for 8-bit chips is already starting to disappear. Why? Not because they are no longer useful, but because the difference in price between an 8-bit and a 32-bit chip these days is negligible, and if you can standardize ALL of your development on a single chip, so much the better. Why bother having one $2 8-bit chip to do one task, a $2 16-bit chip to do another task and a $2 32-bit chip to do a third task when you can do all three using $2 32-bit chips?
This is the same reason why AMD plans to stop making 32-bit desktop processors in the not-too-distant future. Already the difference in price between making the 64-bit Athlon64/Opteron and a theoretical 32-bit version of the same chip is very small (5% difference in die size according to many previous AMD documents, so probably about a 1-3% difference in total cost of production). That number will fall over the next two years to the point where it's totally pointless for them to bother making 32-bit chips anymore.
32-bit chips aren't going to disappear after 2005, but sooner or later they will. Hell, eventually ALL processors as we know them today are likely to be replaced, probably by something that none of us can even guess about today.
FWIW at 130nm fab process (ie current processors), AMD figures that their 64-bit chips have roughly a 5% larger die-size than an otherwise identical 32-bit chip. When you shrink to a 90nm fab process (next year's chips) that number shrinks, and when you start switching to a 65nm fab process (at about the end of 2005) the difference is totally negligible.
Power consumption for 32 vs. 64-bit chips might be a minor issue for embedded chips, but for AMD's desktop chips it's already a non-issue. AMD's Opteron and IBM's PowerPC 970 chips (both 64-bit) already consume less power than Intel's similarly performing Pentium4 and Xeon chips (32-bit), simply due to different designs.
And just how many 386 chips do you think AMD or Intel makes?
Simple fact of the matter is that a company only produces the previous generation of processor for so many years after the new one comes out. AMD is simply saying that they will stop producing their 32-bit AthlonXP chips around the end of 2005, or roughly two and a half years after they brought out its successor. This seems like a pretty reasonable timeline.
No one other than the odd-ball Slashdot editors is predicting that 32-bit processors will cease to function or anything like that. If you bother to read the article, you'll see that the Slashdot tag line for this discussion is totally incorrect.
Opterons are not cheap, but Itaniums are flat-out expensive. The absolute cheapest Itanium2 chips that Intel sells (their 1GHz, 1.5MB L3 cache version) cost $744, and the top-end model running at 1.5GHz with 6MB of L3 cache tips the scales at $4,225. On the AMD side, the top-end chip for 4P configurations (the Opteron 848) will set you back $3,199, though the Opteron really hits a nice price-point for dual-processor setups, where the Opteron 248 costs only $913.
Now, obviously the price of the total system is a different matter altogether, and unfortunately there isn't much info out yet on the price of 4P servers. I was able to get a price for a quad Opteron server from Penguin Computing, and at ~$23,000 for 4 x 846 Opterons and 8GB of memory the price was pretty comparable to an HP Integrity rx2600 server with 2 x 1.5GHz/6MB L3 Itanium2 chips and 2GB of memory (trying to keep other components to a minimum but as similar as possible, i.e. both had 15Krpm SCSI drives and redundant power supplies).
Given the relatively poor results that the current engineering department seems to be achieving with UltraSPARC processors, I wouldn't hold my breath for anything at all.
Now what would be interesting is if Sun put some of their platform development behind AMD's Opteron. While Sun's processors themselves aren't anything to write home about, they have some pretty impressive interconnect technologies. If they were to combine that with the HyperTransport links on AMD processors, they might be able to achieve some rather impressive systems.
That being said, the money seems to be in 2 and 4 processor servers these days. The real big (> 8 processor) servers seem to be fading away somewhat and probably aren't making much money after you take into account the (rather high) development costs.
Could be because the Opteron is one of the fastest chips in the world at executing Java code right now, and that's when running in IA-32 (aka 32-bit x86) mode?
Check out the results for SPECjbb2000. On a per-processor basis, AMD's Opteron chips are second only to Intel/HP Itanium2-based systems, and the Opterons are quite a bit cheaper. Actually, with the new x48 Opteron chips announced alongside the Sun deal, AMD should make up most of the current 8% difference between the two chips.
So, they get better performance than anything IBM has to offer (even the full-fledged Power4 can't match the Opteron in Java if the above test is to be believed) and a much lower price tag than what Intel is looking for. Seems like a pretty good choice if you ask me.
Actually I would tend to disagree, in a manner of speaking. Yes, high-bandwidth/low-latency I/O is tremendously important in real-world supercomputer applications, but Linpack doesn't always show this, because Linpack can be run in parallel pretty easily. The reason why the cluster does so well in this test is very much related to the processor itself, though its performance in other applications may end up being much more closely connected to its I/O performance.
The real key here is that the PPC 970 has a multiply-accumulate instruction and can decode and retire two such instructions each clock cycle. Since Linpack does mainly multiplies and adds on data, this instruction is ideal and makes things go REAL fast. Notice that the Big Mac has an Rpeak value of 8 GFlops per 2GHz PPC970 processor as compared to the 4 GFlops per 2GHz Opteron processor for LLNL Lightning.
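Those Rpeak figures fall straight out of clock speed times floating-point operations per cycle. A quick sketch of the arithmetic: the 2-per-cycle multiply-accumulate rate for the PPC 970 is from the paragraph above, while the 2 flops per cycle for the Opteron is an assumption chosen because it matches the quoted 4 GFlops.

```python
def peak_gflops(clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak: clock rate times floating-point ops per cycle."""
    return clock_ghz * flops_per_cycle


# PPC 970: two multiply-accumulates per cycle, each counted as
# 2 floating-point operations (one multiply + one add).
ppc970 = peak_gflops(2.0, 2 * 2)

# Opteron: assumed 2 floating-point operations per cycle.
opteron = peak_gflops(2.0, 2)

print(f"PPC 970 @ 2GHz: {ppc970} GFlops peak")
print(f"Opteron @ 2GHz: {opteron} GFlops peak")
```

Rpeak is a ceiling, not a measurement. How close Linpack actually gets to it depends on how well the code keeps those units fed.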
FWIW there are Infiniband clusters using Xeons and Opterons in the list. They don't really do any better, clock for clock and processor for processor, than the Myrinet clusters do in Linpack, though they probably would in many real-world supercomputer applications.
There is no one answer to this question, but there are a few things to consider.
First off, commodity processors simply have WAY larger economies of scale. NEC is just about the only company left designing actual processors for supercomputers (their SX series, as used by Earth Simulator, among others). The simple fact is that even if you sell 5,000 processors for a supercomputer, you're probably only going to sell one or two of those a year if you're REALLY lucky. For comparison, Intel sells ~130 million x86 chips a year. Simply put, there's a LOT more money going into the R&D for commodity chips.
The story is somewhat similar for high-bandwidth/low-latency I/O. Ethernet still isn't going to cut it except for pretty small clusters, but things like Myrinet, Quadrics and InfiniBand get you at least within the same ballpark as traditional supercomputer designs. They aren't quite there yet though, and this is still a major shortcoming of cluster design. However, there are at least two supercomputers I've seen that use a sort of hybrid design with AMD Opteron processors. AMD's Opteron has really nifty "HyperTransport" point-to-point I/O connections right on-chip. A couple of companies (most notably Cray, but also a little guy called OctigaBay) are using the mostly-commodity hardware of AMD Opteron systems and custom interconnect chips that hang directly off HyperTransport links to create massively parallel processing supercomputers. This may be the way of the future for supercomputers: not exactly a cluster, but still built from almost entirely commodity parts.
Next is the simple benchmark itself. LINPACK is a relatively simple benchmark that is fairly easy to run in parallel. Some supercomputer applications closely mirror this (Linpack is just solving linear equations after all, and that's the main thing that a lot of supercomputers are used for), others do not. In many cases though, I/O bandwidth and latency really become your limiting factor. Unless your data can be very easily cordoned off into little chunks that run on each node, you tend to spend all of your time waiting for data from remote machines. In the case of the Big Mac with 10GBit ethernet, you're looking at a best-case scenario of getting 1/5th of the bandwidth of local memory (1.25GB/s vs. 6.4GB/s) and at more than an order of magnitude higher latency (~100ns vs ~2us; note of course that these are quick 'n dirty estimates, but the ratios are reasonably accurate). What's even worse, even on the nodes themselves the memory bandwidth is fairly low and latency rather high for most tasks (always has been for supercomputers, regardless of the architecture), so you're taking a performance hit in an area where the system is already weak. Linpack doesn't necessarily show this weakness too much, but actual applications run on the supercomputer might. As always, YMMV.
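The ratios above are easy to check, and a simple latency-plus-bandwidth model shows why small remote accesses hurt the most. This is only a sketch using the quick 'n dirty estimates from the paragraph above; the cost model and the names are assumptions for illustration.

```python
NETWORK_GB_PER_S = 10 / 8      # 10 Gbit/s link = 1.25 GB/s
LOCAL_MEM_GB_PER_S = 6.4       # local memory bandwidth estimate
NETWORK_LATENCY_US = 2.0       # ~2us remote latency estimate
LOCAL_LATENCY_US = 0.1         # ~100ns local latency estimate

bandwidth_fraction = NETWORK_GB_PER_S / LOCAL_MEM_GB_PER_S
latency_ratio = NETWORK_LATENCY_US / LOCAL_LATENCY_US


def transfer_time_us(size_bytes: int, latency_us: float, gb_per_s: float) -> float:
    """Simple cost model: fixed latency plus size divided by bandwidth."""
    return latency_us + size_bytes / (gb_per_s * 1e3)  # 1 GB/s = 1e3 bytes/us


print(f"network bandwidth is {bandwidth_fraction:.2f}x local memory")
print(f"network latency is {latency_ratio:.0f}x local memory")
for size in (64, 64 * 1024, 16 * 1024 * 1024):
    local = transfer_time_us(size, LOCAL_LATENCY_US, LOCAL_MEM_GB_PER_S)
    remote = transfer_time_us(size, NETWORK_LATENCY_US, NETWORK_GB_PER_S)
    print(f"{size:>9} bytes: remote is {remote / local:.1f}x slower than local")
```

For tiny transfers the latency gap dominates; for big ones the penalty settles toward the bandwidth gap. Either way, code that can't keep its data local pays for it.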
I hope this doesn't sound like I'm just feeding you some line about I/O, but.. umm.. I/O is really where you differentiate the men from the boys in supercomputers.
FWIW, you might want to check out this PDF. It's Cray describing their new Red Storm architecture, and among other things it talks a lot about the challenges that supercomputers face.
The code to run Linpack is very well known and well optimized for damn near every processor out there. LANL might be able to optimize their interconnect system a bit to squeeze some more performance out of it, but they aren't going to get much.
The reason that the PowerPC 970 does pretty darn well on this benchmark is the nifty little multiply-accumulate instruction. Since LINPACK is essentially nothing but multiply-adds, this instruction is VERY useful for the benchmark. A fair bit of real-world supercomputer code is also mostly multiplies and adds together (think solving large matrices), so the performance numbers are not entirely out of line. That being said, a new HPC benchmark suite is in the works to replace the rather simplistic LINPACK test, but it won't be here until '05 or '06.
As for Xserve G5s, heat's a non-issue. The Opteron and the PowerPC 970 (aka the G5) consume roughly the same amount of power, and dual Opterons are all over the place. Heck, there are even some 1U dual Itanium servers! Now THOSE chips run HOT! An Itanium2 pumps out up to ~115W, compared with a maximum of only about 60-70W for the PPC 970 and Opteron (though I should point out that documentation for Opteron is a bit weak in this regard and the PPC 970's documentation is basically nonexistent to the public, so a bit of guesswork is required).
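To see why a multiply-accumulate instruction matters so much, here is a toy linear solver using plain Gaussian elimination with partial pivoting. It is only a sketch to show the shape of the work, not the actual LINPACK code (which uses heavily tuned library routines), but the innermost loops are the same kind of multiply-then-add operation.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]

    for col in range(n):
        # Pick the largest pivot in this column for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]   # multiply, then add

    # Back substitution: again nothing but multiply-adds.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        acc = m[row][n]
        for k in range(row + 1, n):
            acc -= m[row][k] * x[k]               # multiply, then add
        x[row] = acc / m[row][row]
    return x


print(solve_linear_system([[2, 1], [1, 3]], [3, 5]))
```

Nearly all of the running time for a large system goes into those two marked lines, so a chip that can issue each one as a single instruction has a built-in head start.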
No, memory is not the only place that bit errors can occur. However, with 4.4TB of memory, they are likely to get roughly 1 soft memory error an hour, and that's using pretty conservative estimates. Every other type of error is likely to happen at rates an order of magnitude lower and they STILL use error detection/correction for it.
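To see what one error an hour means for an actual job, here is a rough sketch. It takes the 1-per-hour cluster-wide estimate above as given and assumes errors arrive independently at a constant rate (a Poisson process), which is a simplification.

```python
import math

ERRORS_PER_HOUR = 1.0   # cluster-wide estimate from the paragraph above


def prob_at_least_one_error(hours: float,
                            errors_per_hour: float = ERRORS_PER_HOUR) -> float:
    """Chance of one or more soft errors during a run (Poisson assumption)."""
    return 1.0 - math.exp(-errors_per_hour * hours)


for hours in (0.1, 1, 4, 24):
    p = prob_at_least_one_error(hours)
    print(f"{hours:>4} hour run: {p:.1%} chance of at least one soft error")
```

Anything that runs for more than a few hours across the whole cluster is close to guaranteed to see at least one flipped bit somewhere, and without ECC nothing will flag it.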
Good chip, good performance, but there are reasons why the cluster was cheap. As long as Virginia Tech is aware of their limitations, they can get by. In the case of the missing ECC, the limitation is that you can't trust your data, or even your instruction stream for that matter.
Nothing to explain because it doesn't exist. The poster is correct: the lack of ECC is a MAJOR shortcoming of this cluster. There is no software correction as some people seem to think; the "Deja Vu" software is something TOTALLY different, designed to detect hardware failures, not soft memory errors on fully functional hardware.
ECC is a planned upgrade for the Big Mac. Until it arrives they plan on just doing every calculation twice and hoping that their run times are short enough, and the error rate low enough, that it doesn't hurt performance too much.
What happened is that it stopped working ages ago. Spammers virtually never send through their own IPs; almost everything goes through open proxies, and unfortunately there are a few million of those on the internet. If you block an IP, all you're doing is blocking some poor sap with a broadband connection whose Windows box got 0wned by some spammer. By the time you get that IP blocked, the user has got a new IP through their DHCP server and the spammer has moved on to their next list of 1,000 compromised systems running open proxies (or possibly even a dedicated spamming application).
You are correct that to filter spam effectively you need multiple filtering techniques, but filtering by IP isn't all that useful these days. It blocks only trivial amounts of spam and has a fairly high potential to block legitimate e-mail.
The patent was filed Dec. 16, 1999. I don't know about the rest of you lot, but I've been receiving spam for a lot longer than that, and I'm SURE that there are some spammers who sent out multiple versions of the same message to different e-mail addresses, thereby showing prior art.
That being said, I doubt that it matters much. If a spammer wanted to argue prior art, they would have to show their face in court, which would probably land them in jail for the existing warrants against them (whether for spamming or for other "business ventures" that most of these spammers seem to be involved in). On the flip side, virtually all spamming is already illegal in some way, shape or form. By my rough estimates, about 90% of it is either scams and fraud attempts, sending obscene and pornographic material (some of which is illegal all on its own) without any attempts at age verification, running illegal/unlicensed pharmacies, etc. etc. And that's without taking into account any existing anti-spam laws. Considering that law enforcement agencies haven't been able to do too much to combat the existing illegal behavior, I doubt that AT&T will be able to do much better.
Let me get this straight. The system you're considering now is an Athlon processor, running at somewhere on the order of 2.0GHz, with 128K of L1 cache and 256K or 512K of L2 cache and 2.7GB/s or 3.2GB/s of bandwidth, with the capability of using up to about 2GB of memory and the latest and greatest IDE hard drives.
But you're thinking that you can switch it for a 250MHz MIPS32-based CPU with 64K of L1 cache, zero L2 cache, something like 200MB/s of bandwidth, a maximum of maybe 256MB of memory (though apparently these boxes are unstable when using both SIMMs) and if you're lucky you MIGHT be able to use DMA on the hard drives, though that's iffy.
Sure it's a nice alternative if you've got a Cobalt Qube lying around the house and your "server" does mostly pretty trivial stuff, but if you're actually going to have to buy parts for it, there are MUCH faster, better supported and cheaper alternatives out there!
That's not really a problem. If the GPL is determined to be unenforceable, The Canopy Group (and a few other key investors) collect their money in a quick and timely fashion and then get the hell out of Dodge! By the time the lawsuits come flying in, there's nothing left but the wasted shell of a company that is SCO. Darl will probably be kept around as the fall guy, but it's not like that will tarnish his reputation any more than it already was even before he got involved in this whole deal (probably part of the reason why Darl was picked to run the company: everyone already knew he was a scumbag). He'll sit by and watch the company go under over the next year or two while his fortune sits nice and tidy in some bank account. He'll probably even manage to earn a nice fat salary and bonus while the company is going down and the little investors that didn't get out soon enough lose their shirts.
This is all a fairly short-term get-rich-quick scam, any long-term product strategies for the company are more or less irrelevant.
All we can hope for is that the courts see this fraud for what it is and throw the perpetrators in jail. Sadly, this is unlikely to happen; the corporate veil will probably hide them.
The single-processor Opteron systems could be had for about $2,000 when they first came out, quite a bit cheaper than the dual-processor PowerMac G5.
The difference was the Steve Jobs Reality Distortion Field, nothing else.
Re:Apple's advertising is false and misleading (on "Apple G5 Ads Banned In UK", Score: 2, Insightful)
Single processor Opteron systems with AGP graphics, plain old PCI slots and ATA hard drives were out months before Apple even announced they had "the world's first 64-bit personal computer".
Calling a dual-processor computer with a PCI-X bus a "personal computer" and a single-processor system with a PCI bus a "workstation"?
The original poster is right, Apple's ads are incorrect and misleading. That being said, so are the ads of just about every other company out there.
I really liked how the dual-processor G5 with PCI-X slots was considered the first 64-bit personal computer, but the single-processor AMD Opteron systems with only plain old PCI slots were 64-bit workstations.
How long? A LONG time I would guess.
First off, the definition of a bit-width has changed somewhat from the days of 4, 8 and even 16-bit chips. Back then you were normally talking about the size of your registers and the width of your data bus. Now we're mainly talking about the number of bits you can directly address in a flat memory space (ie the size of your pointers). So the change hasn't been quite as dramatic as you suggest (though it certainly has been dramatic).
However, with 64-bit chips, we can now directly address a LOT of memory, 10^19 bits worth. We're really going to have to make some fairly major changes to our current microprocessor paradigm before we get to that point. Processors keep getting exponentially faster, but most other components in the system are only improving in speed linearly (or, at best, at a much slower exponential rate). If we keep this up, but the time we're dealing with 10^19 bits worth of data, the speed of I/O systems will be so out of whack with the speed of the processors that we won't get anything done.
There are some benchmarks available at the Spec website. You have to do a little bit of digging, but AMD has submitted a lot of results with identical systems running 32-bit and 64-bit Linux.
On average, running AMD64 code will buy you about a 5% boost in performance over running IA-32 code on the Opteron/Athlon64. Of course, some applications will see a much larger improvement (some apps are more than twice as fast), while others will actually be slower (64-bit pointers = more data to load from memory and fewer pointers that can be stored in cache). Note that this actually has just about nothing to do with 32 vs. 64-bit for most cases, but rather that AMD64 doubled the number of general purpose registers (16 vs. 8 in IA-32). Considering that the lack of registers was one of the last major downsides to the x86 instruction set (as compared to something like PowerPC, MIPS, SPARC, etc.), this was a very good idea. Of course, even 16 visible registers is rather low, most other ISAs have 32+ visible registers.
The latest and greatest Itanium2 (running at 1.5GHz and with 6MB of L3 cache) is sometimes faster than AMD's Opteron and sometimes slower when both are running natively compiled code. In generally they are pretty close. This is actually quite a coup for AMD, since the Opteron is a much smaller and cheaper chip. The fact that they're able to throw down with the big boys (Intel's Itanium2 and IBM's Power4+) is pretty darn impressive.
For WindowsXP, right now the only 64-bit version of Windows XP is for Intel's Itanium instruction set (IA-64). Microsoft does plan on producing a version of WinXP for AMD's 64-bit instruction set (AMD64, aka x86-64), but it has been delayed multiple times (typical Microsoft) and now isn't expected until mid-2004.
The larger addressable memory space is the primary advantage to using 64-bit chips. You also get 64-bit integer registers (generally speaking, though the addressable memory size and integer register size aren't always linked on all architectures, though if you have only one of the above I wouldn't really call it a 64-bit chip). For most applications, 64-bit registers don't get you anything, but there are a few situations where they do help.
:>
As for 64-bit on the desktop, I think that now is the time to start the move. Installed memory on desktops tends to double every 18-24 months. Right now 1GB is the norm for a high-end desktop. That means that by mid to late 2005, 2GB will be the norm, and that's pretty much the max for 32-bit chips without resorting to all kinds of ugliness. Sure, you can still address 4GB of memory with a 32-bit chip, but that's your virtual address space. Having more than 2GB of memory on a 32-bit chip means that you need less virtual memory than physical memory, which is fine for a system like Linux, but not so good for BSD or Windows (due to differences in how they use virtual memory). You also start running into some issues of memory fragmentation. In short, a 64-bit processor becomes a real advantage any time you have 2GB of more memory, which as mentioned above, will likely become common place for new systems sometime in mid to late 2005.
So why move to 64-bit now if you don't need it for two years? Simple, software is much slower (and more expensive) to move to new architectures than hardware. AMD's 64-bit chips have been out for 6 months already, and they were at least 6 months late before that, yet we've only just recently started seeing the first versions of AMD64 Linux. WinXp for AMD64 still isn't year (won't be for 6+ months), and applications will take time after that. If you wait until the last minute to start shipping 64-bit chips, you won't have an operating system to use it for.
As for the Jurassic Park thing, even if I had the most powerful render farm in the world with nearly infinite resources, I still don't think that I would be making Jurassic Park. That sort of thing takes a bit more than just processing power!
It is a Cyrix derivative, though a few generations removes. Cyrix got bought out by National Semi a handful of years back ('98 if my memory is working correctly). After two years of mismanagement, they were mostly sold off to VIA, all except for the Geode division. NS kept on developing the Geode and selling it into set-top boxes with little to no success.
Then a couple months ago, they sold that division off to AMD. I suspect that the transfer of people, technology and whatnot is still a work-in-progress, but practically speaking, the Geode is now an AMD product.
To the best of my knowledge AMD plans on doing a fair bit of overhauling of the design, but to keep the basic philosophy (ie low-power, low-cost, heavily integrated x86 chip).
Sure there is a market for 8-bit processors now, but will there be in 10 years? 50 years? There was a market for vacuum tube-based computers after transistors made them totally obsolete, but you sure as hell don't find anyone making vacuum tube based compared anymore (actually, knowing the Slashdot community, there probably is some wierdo out there doing just that, but I digress :> ).
The market for 8-bit chips is already starting to disapear. Why? Not because they are no longer useful, but because the difference in price between an 8-bit and a 32-bit chip these days is negligible, and if you can standardize ALL of your development on a single chip, so much the better. Why bother have one $2 8-bit chip to do one task, a $2 16-bit chip to do another task and a $2 32-bit chip to do a third task when you can do all three using $2 32-bit chips.
This is the same reason why AMD plans to stop making 32-bit desktop processors in the not-too-distant future. Already the difference in price between making the 64-bit Athlon64/Opteron and a theoretical 32-bit version of the same chip is very small (5% difference in die size according to many previous AMD documents, so probably about a 1-3% difference in total cost of production). That number will fall over the next two years to the point where it's totally pointless for them to bother making 32-bit chips anymore.
32-bit chips aren't going to disapear after 2005, but sooner or later they will. Hell, eventually ALL processors as we know them today are likely to be replaced, probably by something that none of us can even guess about today.
FWIW at 130nm fab process (ie current processors), AMD figures that their 64-bit chips have roughly a 5% larger die-size than an otherwise identical 32-bit chip. When you shrink to a 90nm fab process (next year's chips) that number shrinks, and when you start switching to a 65nm fab process (at about the end of 2005) the difference is totally negligible.
Power consumption for 32 vs. 64-bit chips might be a minor issue for embedded chips, but for AMD's desktop chips it's already a non-issue. AMD's Opteron and IBM's PowerPC 970 chips (both 64-bit) already consume less power than Intel's similarly-performing Pentium4 and Xeon chips (32-bit), simply due to different designs.
And just how many 386 chips do you think AMD or Intel makes?
Simple fact of the matter is that a company only produces the previous generation of processor for so many years after the new one comes out. AMD is simply saying that they will stop producing their 32-bit AthlonXP chips around the end of 2005, or roughly two and a half years after they brought out its successor. This seems like a pretty reasonable time-line.
No one other than the odd-ball Slashdot editors is predicting that 32-bit processors will cease to function or anything like that. If you bother to read the article, you'll see that the Slashdot tag line for this discussion is totally incorrect.
Opterons are not cheap, Itaniums are flat-out expensive. The absolute cheapest Itanium2 chips that Intel sells (their 1GHz, 1.5MB L3 cache version) cost $744, and the top-end model running at 1.5GHz with 6MB of L3 cache tips the scales at $4225. The top-end Opteron for 4P configurations (the Opteron 848) will set you back $3,199, though the Opteron really hits a nice price-point for dual-processor setups, where the Opteron 248 costs only $913.
Now, obviously the price of the total system is a different matter altogether, and unfortunately there isn't much info out yet on the price of 4P servers. I was able to get a price for a quad Opteron server from Penguin computers, and at ~$23,000 for 4 x 846 Opterons and 8GB of memory the price was pretty comparable to an HP Integrity rx2600 server with 2 x 1.5GHz/6MB L3 Itanium2 chips and 2GB of memory (trying to keep other components to a minimum but as similar as possible, ie both had 15Krpm SCSI drives and redundant power supplies).
Given the relatively poor results that the current engineering department seems to be achieving with UltraSparc processors, I wouldn't hold my breath for anything at all.
Now what would be interesting is if Sun put some of their platform development behind AMD's Opteron. While Sun's processors themselves aren't anything to write home about, they have some pretty impressive interconnect technologies. If they were to combine that with the Hypertransport links on AMD processors, they might be able to achieve some rather impressive systems.
That being said, the money seems to be in 2 and 4 processor servers these days. The real big (> 8 processor) servers seem to be fading away somewhat and probably aren't making much money after you take into account the (rather high) development costs.
Could be because the Opteron is one of the fastest chips in the world at executing Java code right now, and that's when running in IA-32 (aka 32-bit x86) mode?
Check out the results for SPEC JBB2000. On a per-processor basis, AMD's Opteron chips are second only to Intel/HP Itanium2 based systems, and the Opterons are quite a bit cheaper. Actually, when combined with the new x48 Opteron chips announced alongside the Sun deal, AMD should make up most of the current 8% difference between the two chips.
So, they get better performance than anything IBM has to offer (even the full-fledged Power4 can't match the Opteron in Java if the above test is to be believed) and a much lower price tag than what Intel is looking for. Seems like a pretty good choice if you ask me.
Actually I would tend to disagree, in a manner of speaking. Yes, high-bandwidth/low latency I/O is tremendously important in real-world supercomputer applications, but Linpack doesn't always show this, since Linpack can be run in parallel pretty easily. The reason why the cluster does so well in this test is very much related to the processor itself, though its performance in other applications may end up being much more closely connected to its I/O performance.
The real key here is that the PPC 970 has a multiply-accumulate instruction and can decode and retire two such instructions each clock cycle. Since Linpack does mainly multiplies and adds on data, this instruction is ideal and makes things go REAL fast. Notice that the Big Mac has an Rpeak value of 8 GFlops per 2GHz PPC970 processor as compared to the 4 GFlops per 2GHz Opteron processor for LLNL Lightning.
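FWIW, those Rpeak numbers fall out of simple arithmetic. Here's a rough sketch in Python, assuming each multiply-accumulate counts as two floating-point operations and that the Opteron retires two plain floating-point operations per cycle (that last bit is my assumption, worked backwards from the 4 GFlops figure). The function name is made up for illustration.

```python
def rpeak_gflops(clock_ghz, fp_instr_per_cycle, flops_per_instr):
    """Theoretical peak GFlops for a single processor."""
    return clock_ghz * fp_instr_per_cycle * flops_per_instr

# PPC 970: two multiply-accumulates per cycle, each one multiply plus one add
ppc970 = rpeak_gflops(2.0, 2, 2)

# Opteron (assumed): two floating-point instructions per cycle, one flop each
opteron = rpeak_gflops(2.0, 2, 1)

print(ppc970, opteron)  # 8.0 4.0
```

Same clock speed, twice the theoretical peak, purely because one instruction does the work of two.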
FWIW there are Infiniband clusters using Xeons and Opterons in the list. They don't really do any better, clock for clock and processor for processor, than the Myrinet clusters do in Linpack, though they probably would in many real-world supercomputer applications.
Only problem with that is that you'll probably have the cops beating down your door thinking you've got some huge dope-growing operation!
"No officer, I swear it's not a hydroponics setup that is eating up 1MW of electrical power, it's just the supercomputer in my basement!"
"Sure son. Book him boys!"
There is no one answer to this question, but there are a few things to consider.
First off, commodity processors simply have WAY larger economies of scale. NEC is just about the only company left designing actual processors for supercomputers (their SX series, as used by Earth Simulator, among others). The simple fact is that even if you sell 5,000 processors for a supercomputer, you're probably only going to sell one or two of those a year if you're REALLY lucky. For comparison, Intel sells ~130 million x86 chips a year. Simply put, there's a LOT more money going into the R&D for commodity chips.
The story is somewhat similar for high bandwidth/low latency I/O. Ethernet still isn't going to cut it except for pretty small clusters, but things like Myrinet, Quadrics and Infiniband get you at least within the same ballpark as traditional supercomputer designs. They aren't quite there yet though, and this is still a major shortcoming of cluster design. However, there are at least two supercomputers I've seen that use a sort of hybrid design with AMD Opteron processors. AMD's Opteron has really nifty "hypertransport" point-to-point I/O connections right on-chip. A couple of companies (most notably Cray, but also a little guy called OctigaBay) are using the mostly-commodity hardware of AMD Opteron systems and custom interconnect chips that hang directly off hypertransport links to create massively parallel processing supercomputers. This may be the way of the future for supercomputers, not exactly a cluster but still built from almost entirely commodity parts.
Next is the simple benchmark itself. LINPACK is a relatively simple benchmark that is fairly easy to run in parallel. Some supercomputer applications closely mirror this (Linpack is just solving linear equations after all, and that's the main thing that a lot of supercomputers are used for), others do not. In many cases though, I/O bandwidth and latency really become your limiting factor. Unless your data can be very easily cordoned off into little chunks that run on each node, you tend to spend all of your time waiting for data from remote machines. In the case of the Big Mac with 10GBit ethernet, you're looking at a best case scenario of getting 1/5th of the bandwidth of local memory (1.25GB/s vs. 6.4GB/s) and at more than an order of magnitude higher latency (~100ns vs ~2us; note of course that these are quick 'n dirty estimates, but the ratios are reasonably accurate). What's even worse, even on the nodes themselves the memory bandwidth is fairly low and latency rather high for most tasks (always has been for supercomputers, regardless of the architecture), so you're taking a performance hit in an area where the system is already weak. Linpack doesn't necessarily show this weakness too much, but actual applications run on the supercomputer might. As always, YMMV.
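To put numbers on those ratios, here's a quick sketch in Python using the same quick 'n dirty estimates as above (10Gbit/s link, 6.4GB/s local memory, ~100ns local vs ~2us remote latency). It ignores protocol overhead entirely, so treat it as a best case. The variable and function names are mine.

```python
def gbit_to_gbyte(gbit_per_s):
    """Convert a link speed in Gbit/s to GB/s (8 bits per byte, no overhead)."""
    return gbit_per_s / 8.0

link_bw_gbs = gbit_to_gbyte(10.0)   # 1.25 GB/s over the interconnect
mem_bw_gbs = 6.4                    # GB/s to local memory
bw_ratio = link_bw_gbs / mem_bw_gbs # a bit under 1/5th

local_latency_ns = 100.0            # rough local memory latency
remote_latency_ns = 2000.0          # ~2us to reach a remote node
latency_ratio = remote_latency_ns / local_latency_ns

print(f"remote bandwidth is {bw_ratio:.2f}x local")
print(f"remote latency is {latency_ratio:.0f}x local")
```

So any code that keeps reaching across to other nodes gets about a fifth of the bandwidth and pays roughly twenty times the latency on every trip.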
I hope this doesn't sound like I'm just feeding you some line about I/O, but.. umm.. I/O is really where you differentiate the men from the boys in supercomputers.
FWIW, you might want to check out this PDF. It's Cray describing their new Red Storm architecture, and among other things it talks a lot about the challenges that supercomputers face.
The code to run Linpack is very well known and well optimized for damn near every processor out there. LANL might be able to optimize their interconnect system a bit to squeeze some more performance out of it, but they aren't going to get much.
The reason that the PowerPC 970 does pretty darn well on this benchmark is the nifty little multiply-accumulate instruction. Since LINPACK is essentially nothing but multiply-adds, this instruction is VERY useful for the benchmark. A fair bit of real-world supercomputer code is also mostly multiplies and adds together (think solving large matrices), so the performance numbers are not entirely out of line. That being said, a new HPC benchmark suite is in the works to replace the rather simplistic LINPACK test, but it won't be here until '05 or '06.
As for XServe G5's, heat's a non-issue. The Opteron and the PowerPC 970 (aka the G5) consume roughly the same amount of power, and dual Opterons are all over the place. Heck, there are even some 1U dual Itanium servers! Now THOSE chips run HOT! An Itanium2 pumps out up to ~115W, compared with a maximum of only about 60-70W for the PPC 970 and Opteron (though I should point out that documentation for Opteron is a bit weak in this regard and the PPC 970's documentation is basically nonexistent to the public, so a bit of guesswork is required).
No, memory is not the only place that bit errors can occur. However, with 4.4TB of memory, they are likely to get roughly 1 soft memory error an hour, and that's using pretty conservative estimates. Every other type of error is likely to happen at rates an order of magnitude lower and they STILL use error detection/correction for it.
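For a sense of scale, here's a back-of-the-envelope sketch in Python of what that rate implies. It assumes 4.4TB means 4,400GB, takes the 1-error-an-hour figure at face value, and treats errors as independent events (a Poisson process), which is my simplification. The function names are made up for illustration.

```python
import math

TOTAL_MEMORY_GB = 4400.0   # 4.4TB, decimal units assumed
ERRORS_PER_HOUR = 1.0      # cluster-wide estimate quoted above

def errors_per_gb_year(total_gb, errors_per_hour):
    """Implied soft error rate per GB of memory per year."""
    return errors_per_hour * 24 * 365 / total_gb

def p_at_least_one_error(run_hours, errors_per_hour):
    """Chance a run of the given length sees one or more errors (Poisson assumption)."""
    return 1.0 - math.exp(-errors_per_hour * run_hours)

# Works out to roughly 2 errors per GB per year
print(errors_per_gb_year(TOTAL_MEMORY_GB, ERRORS_PER_HOUR))

# Odds of hitting at least one error during a 10 minute and a 10 hour job
print(p_at_least_one_error(10 / 60, ERRORS_PER_HOUR))
print(p_at_least_one_error(10, ERRORS_PER_HOUR))
```

Under those assumptions a short job has decent odds of running clean, but anything that runs for hours across the whole cluster is almost guaranteed to hit at least one flipped bit somewhere.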
Good chip, good performance, but there are reasons why the cluster was cheap. As long as Virginia Tech is aware of their limitations, they can get by. In the case of the missing ECC, the limitation is that you can't trust your data, or even your instruction stream for that matter.
1. Go read for yourself.
I read it myself, it doesn't exist.
2. No, you explain it to me.
Nothing to explain because it doesn't exist. The poster is correct, the lack of ECC is a MAJOR shortcoming of this cluster. There is no software correction as some people seem to think, the "Deja Vu" software is something TOTALLY different designed to detect hardware failures, not soft memory errors on fully functional hardware.
ECC is a planned upgrade for the Big Mac. Until it arrives they plan on just doing every calculation twice and hoping that their calculations are short enough that the error rate will be sufficiently low, and run times short enough, that it won't hurt performance too much.
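The "do it twice" idea looks something like the sketch below, in Python. To be clear, this is my own illustration and not Virginia Tech's actual scheme: the helper name and the retry-until-two-runs-agree behaviour are assumptions.

```python
def run_checked(compute, *args, max_attempts=5):
    """Run compute twice per attempt and accept the result only if both runs agree."""
    for _ in range(max_attempts):
        first = compute(*args)
        second = compute(*args)
        if first == second:
            return first
        # A mismatch means at least one run was corrupted, so redo both.
    raise RuntimeError("results never agreed")

# Usage: a deterministic calculation passes on the first attempt
print(run_checked(sum, [1, 2, 3]))
```

Note the cost: you burn at least twice the compute for every result, and the longer each calculation runs, the more likely one of the two copies hits an error and forces yet another pair of runs. It also only detects errors, it can't tell you which run was the bad one.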
What happened is that it stopped working ages ago. Spammers virtually never send through their own IPs, almost everything goes through open proxies, and unfortunately there are a few million of those on the internet. If you block an IP, all you're doing is blocking some poor sap with a broadband connection whose Windows box got 0wned by some spammer. By the time you get that IP blocked, the user has got a new IP through their DHCP server and the spammer has moved on to their next list of 1,000 compromised systems running open proxies (or possibly even a purpose-built spamming application).
You are correct that to filter spam effectively you need multiple filtering techniques, but filtering by IP isn't all that useful these days. It blocks only trivial amounts of spam and has a fairly high potential to block legitimate e-mail.
The patent was filed Dec. 16, 1999. I don't know about the rest of you lot, but I've been receiving spam for a lot longer than that, and I'm SURE that there are some spammers who sent out multiple versions of the same message to different e-mail addresses, thereby showing prior art.
That being said, I doubt that it matters much. If a spammer wanted to argue prior art, they would have to show their face in court, which would probably land them in jail for the existing warrants against them (whether for spamming or for other "business ventures" that most of these spammers seem to be involved in). On the flip side, virtually all spamming is already illegal in some way, shape or form. By my rough estimates, about 90% of it is either scams and fraud attempts, sending obscene and pornographic material (some of which is illegal all on its own) without any attempts at age verification, running illegal/unlicensed pharmacies, etc. etc. And that's without taking into account any existing anti-spam laws. Considering that law enforcement agencies haven't been able to do too much to combat the existing illegal behavior, I doubt that AT&T will be able to do much better.
Let me get this straight. The system you're considering now is an Athlon processor, running at somewhere on the order of 2.0GHz, with 128K of L1 cache and 256K or 512K of L2 cache and 2.7GB/s or 3.2GB/s of bandwidth, with the capability of using up to about 2GB of memory and the latest and greatest IDE hard drives.
But you're thinking that you can switch it for a 250MHz MIPS32-based CPU with 64K of L1 cache, zero L2 cache, something like 200MB/s of bandwidth, a maximum of maybe 256MB of memory (though apparently these boxes are unstable when using both SIMMs) and if you're lucky you MIGHT be able to use DMA on the hard drives, though that's iffy.
Sure it's a nice alternative if you've got a Cobalt Qube lying around the house and your "server" does mostly pretty trivial stuff, but if you're actually going to have to buy parts for it, there are MUCH faster, better supported and cheaper alternatives out there!
That's not really a problem. If the GPL is determined to be unenforceable, The Canopy Group (and a few other key investors) collect their money in a quick and timely fashion and then get the hell out of Dodge! By the time the lawsuits come flying in, there's nothing left but the wasted shell of a company that is SCO. Darl will probably be kept around as the fall guy, but it's not like that will tarnish his reputation any more than it already was even before he got involved in this whole deal (probably part of the reason why Darl was picked to run the company, everyone already knew he was a scumbag). He'll sit by and watch the company go under over the next year or two while his fortune sits nice and tidy in some bank account. He'll probably even manage to earn a nice fat salary and bonus while the company is going down and the little investors that didn't get out soon enough lose their shirts.
This is all a fairly short-term get-rich-quick scam, any long-term product strategies for the company are more or less irrelevant.
All we can hope for is that the courts see this fraud for what it is and throw the perpetrators in jail. Sadly, this is unlikely to happen, as the corporate veil will probably hide them.
The single-processor Opteron systems could be had for about $2,000 when they first came out, quite a bit cheaper than the dual-processor PowerMac G5.
The difference was the Steve Jobs Reality Distortion Field, nothing else.
Single processor Opteron systems with AGP graphics, plain old PCI slots and ATA hard drives were out months before Apple even announced they had "the world's first 64-bit personal computer".
Calling a dual-processor computer with a PCI-X bus a "personal computer" and a single-processor system with a PCI bus a "workstation"?
The original poster is right, Apple's ads are incorrect and misleading. That being said, so are the ads of just about every other company out there.
I really liked how the dual-processor G5 with PCI-X slots was considered the first 64-bit personal computer, but the single processor AMD Opteron systems with only plain old PCI slots were 64-bit workstations.