AMD Takes Opteron To 2.4GHz
EconolineCrush writes "AMD has added a series of Opteron x50 processors to its workstation and server line that push the K8 core up to 2.4GHz. The Tech Report has tested the latest single and dual-processor Opterons against more than 20 other processors, including exotic Pentium 4 Extreme Edition chips, affordable Athlon 64s, and everything in between. Even if you have no interest in AMD's latest workstation chips, the review is worth checking out to see how two dozen of the fastest workstation and PC processors stack up in rendering, scientific computing, speech recognition, and even gaming tests."
From the article, to save everyone the 16 pages of boring charts and graphs. Conclusions: "If I were building (or, implausibly perhaps, buying) my ultimate workstation right now, I'd want a pair of Opteron 250s beating at the heart of it. The benchmarks speak volumes. For single-processor systems, the Opteron 150 looks like the fastest x86 CPU on the planet. In a multiprocessor configuration, the Opteron 250 scales up very well, even without the benefit of an optimal memory configuration, a NUMA-aware OS, or 64-bit extensions. By contrast, Intel's dual Xeons are a little bit disappointing. They perform relatively well in CPU-bound apps like 3D rendering programs, which are also largely well optimized for SSE2. But in memory-bound applications where dual Xeons ought to do well, like video encoding, the Xeons' slow bus and RAM hold them back. One has to wonder what Intel is hoping to accomplish by saddling its workstation-class processors with older, slower technology. Even a single Pentium 4 benefits greatly from additional bus and memory bandwidth. Surely a pair of Xeons on a shared bus ought to have this same advantage. Intel's apparent willingness to forego such enhancements in favor of adding ever-larger on-chip caches to the Xeon is puzzling."
Hmmm.
I have been running my Opteron 248 at 2400MHz. SiSoft seems to equate this to a PR rating of 3900+. I have no idea how it calculates this, so please take that with a grain of salt.
AEnertia
Witty, tag line goes here
I had an AMD64 chip with the heat spreader.
I went to take the heat sink off the other day, and the vacuum that formed between the heat spreader and heat sink caused the chip to get yanked right out of the closed ZIF socket when I tried to get the heat sink off.
Then, after reinstalling the chip, apparently the heat spreader has become disconnected from the core internally, because the CMOS reports rising temperature up to 120C, but even the heat spreader isn't warm if I turn the system off and get the heat sink off again.
So be very careful. It takes about 10 minutes to take the heat sink off the heat spreader if you used a coating of grease that covers the whole top of the chip, even if you used a thin coat. You have to wiggle the heat sink and gently pull up for quite a while before that vacuum is broken. It doesn't help that the heat sink design makes it impossible to see the chip or slide the heat sink to the side.
And be aware that it doesn't take a whole lot of force to yank the chip right from the ZIF, possibly damaging things in the process.
I've had enough abrasive sigs. Kittens are cute and fuzzy.
64-bit vs. 32 bit using FreeBSD
Don't you ever read the BSD section?
-Jem
It will be interesting to see how Intel responds to these challenges - c't speculates that future Pentiums will use the architecture from the Pentium M line (developed in Israel). If they're smart they'll introduce a dual-core CPU based on the Pentium M architecture; if AMD is smart they'll modify their existing designs and beat Intel to the punch again.
Speaking as a business user, I'd welcome an emphasis on ergonomics and environmental concerns over raw speed. I'd rather have silent systems that do not overload the air conditioning with enormous amounts of heat than screamers which spend 99.9 % of their time waiting for the user to press a key anyway.
"There are already a million monkeys on a million typewriters, and Usenet is NOTHING like Shakespeare." - Blair Houghton
Funny enough, that is exactly what Intel has planned. They will also be shooting for dual-core, and then quad-core CPUs in the next 2-3 years. On the flip side, AMD has announced that they are already capable of producing dual-core Opterons, and are simply waiting for the market demand to meet their capabilities. After all, it doesn't make much sense to introduce something now that can wait until later. It extends the life of the current line and increases the return on R&D.
http://smc.vnet.net/timings50.html is a start.
Sad times when a
Dell Precision 650, 4X3.06GHz Xeon, 512KB L2, 4GB, Win XP Pro V5.1 [35]:
is slower than a
Athlon 2800+, 512 KB cache, 333 MHz FSB, Win XP Pro
Nice troll...
Modern "x86" chips have a very advanced RISC-like architecture internally; x86 is only the instruction set - nothing more.
It takes a little bit of extra circuitry to translate the x86 ops instead of using a brand new ISA, but it's well worth it for the backward compatibility IMHO.
On my dual opteron 244 (2 x 1.8 GHz), compiling a 2.4.26 kernel and its modules:
$ make bzImage && make modules
takes 96 secs.
Has anyone actually checked on the price? Take a gander over at http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_609,00.html?redir=CPT301 The new 150/250/850 models are $637/$851/$1514 respectively. Compare that to the *48 models, which are still expensive.
Does AMD's increased market share herald a new strategy from AMD? Back "in the day" we all used to love AMDs more than Intels because of the great performance/cost ratio.
I would love to have a pair of opterons, but the prices are ridiculous. I miss the old AMD...
I wish reviewers would start including a section on how much power the systems take. I'd like to replace my home server box and would like to minimize power consumption since it runs 24/7. I'd also like to replace my 'desktop' PC and would like to minimize fans because I like to listen to music on it.
The budget is a few thousand euros, not over 10 000 (this is comparable in dollars). What would the best bang-for-the-euro be? Single-Dual? Xeon-Opteron-Itanium2? It must at least contain 4 gig of RAM.
Itanium servers are out of your league. A decent 1.5GHz Itanium chip with 3MB of on-die cache will set you back around $3,000. Not including memory, hard disks, etc. Just for ONE chip.
Xeons are way cheaper, but in most cases are more expensive than Opterons, do not scale very well when used in 2-way or higher configurations, and can only use 4GB in flat mode. To access above 4GB, you need to use PAE, which greatly hampers performance (PAE is akin to the "high-memory" window trick they used back in the DOS days).
Opterons, on the other hand, are usually cheaper than Xeons, much cheaper than Itanium, almost always have better performance than Xeons, scale much better (in fact, a 2-way server performs better than a 1-way times 2!) and are only beaten by Itanium in floating point performance, and then only barely.
There's another thing. Opterons are going to become dual-core in less than 2 years, with the same pinout as today. That means that if you have a lowly 2-way server that you're thinking about dumping, you can buy new dual-core Opterons and instantly get a 4-way out of your old 2-way server. Also, Opterons can linearly address up to 1TB of physical RAM (that's 1,024GB), and up to 256TB of virtual memory. It's also the only 64-bit processor you can get today that works with all your 32-bit x86 software. Finally, Opterons consume less energy than equivalent Xeons or Itaniums, and this becomes very important when thinking about A/C, UPS, standby power generators, etc.
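The memory limits quoted above fall straight out of the number of address bits. A minimal sketch of that arithmetic, assuming 32-bit flat addressing, a 36-bit physical address under PAE, and 40-bit physical / 48-bit virtual addressing on the Opteron (the helper name `addressable` is made up for illustration, and units are binary, matching the "1,024GB" figure):

```python
GB = 2**30  # binary gigabyte, as used in the figures above
TB = 2**40  # binary terabyte


def addressable(bits: int) -> int:
    """Number of bytes that can be named with `bits` address bits."""
    return 2**bits


# Address widths are assumptions stated in the lead-in, not measurements.
widths = {
    "32-bit flat (no PAE)": 32,
    "36-bit physical (PAE)": 36,
    "40-bit physical (Opteron)": 40,
    "48-bit virtual (Opteron)": 48,
}

for label, bits in widths.items():
    size = addressable(bits)
    if size >= TB:
        print(f"{label}: {size // TB} TB")
    else:
        print(f"{label}: {size // GB} GB")
```

Note that PAE widens only the physical address; each process still sees a 32-bit virtual space, which is why it feels like a windowing trick.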
I'd recommend you go with Opteron. Check out some well known tier-2 vendors such as Angstrom, Appro or Verari. They all make excellent quality Opteron servers and workstations. If you want brand names (and are willing to pay for it), check out Sun, Hewlett Packard or IBM for 2-way servers, or HP for a 4-way. IBM even has a dual Opteron workstation, if that's what you want.
Good Luck,
Marcos
AnandTech usually does them in their processor reviews, lemme dig one up.
Here's one, for example.
(Of note, the Athlon FX-51 and -53 are identical to Opteron 148 and 150 processors, respectively. The Athlon 64s are similar as well, difference is they use a different socket, have only single-channel memory controllers, and use unbuffered/unregistered memory.)
Basically, the Hammers are godlike at compilation.
The lowest-rated (at the time; a 2800+ has since been released) A64 3000+ beats the fastest P4 3.4GHz Extreme Edition.
Work is punishment for failing to procrastinate effectively.
This problem does not arise however when we use 'double long' formats, or 64-bit floats, because these are way more precise and still can go a long way when 32-bit doubles already jump to zero, thus causing the problems.
On the x86 architecture, "long double" is 80-bit, and not 64-bit, which is plain "double". "float" is 32-bit.
However, note that the x87 FPU on x86 does all floating point operations internally with 80-bit precision. So you don't get any performance advantage from using only single precision variables (other than lower memory bandwidth usage). Thus, a good rule of thumb is to always use double (long double might be better but isn't portable, and SSE doesn't support it if you want to use that). Single precision is mainly useful when you want to store large amounts of data (remember to cast the part of the data you're working on to double before calculating).
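A small illustration of why storing in single precision but calculating in double is the safer habit. This is only a sketch: Python's `float` is a 64-bit double, and the invented helper `to_single` squeezes a value through 32-bit storage using the standard `struct` module.

```python
import struct


def to_single(x: float) -> float:
    """Round a 64-bit double to the nearest 32-bit float, then widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


tiny = 1e-50
print(tiny)             # still non-zero as a 64-bit double
print(to_single(tiny))  # too small for 32-bit storage, becomes 0.0

# Add 0.1 a hundred thousand times; the exact answer would be 10000.
total_single = 0.0  # running sum squeezed to 32 bits after every step
total_double = 0.0  # running sum kept in 64 bits
for _ in range(100_000):
    total_single = to_single(total_single + to_single(0.1))
    total_double += 0.1

print("single-precision sum error:", abs(total_single - 10_000))
print("double-precision sum error:", abs(total_double - 10_000))
```

The single-precision running sum drifts visibly, while the double-precision sum stays within a tiny fraction of the exact answer.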
As others have pointed out, currently the Opteron is quite unbeatable in price/performance. 10000 EUR should certainly get you a 2-CPU system. Probably not 4 CPUs though? Given that you need lots of memory, especially avoid the Xeon (or some other 32-bit architecture). Linux can only give 3 GB to one process with its default configuration (I guess Windows is similar?). With the so-called 4g/4g patch you can allow 4 GB for each process, but the price is lower performance. With a 64-bit architecture all those problems disappear.
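The 3 GB figure is just the default carve-up of the 32-bit virtual address space on x86 Linux: the kernel keeps the top 1 GB and the process gets the rest. A rough sketch of that arithmetic, assuming the usual 0xC0000000 boundary (the constant names here are illustrative, not kernel identifiers):

```python
GB = 2**30

ADDRESS_SPACE = 2**32      # everything a 32-bit pointer can name
KERNEL_START = 0xC0000000  # assumed default user/kernel boundary (3G/1G split)

user_bytes = KERNEL_START                   # what one process can map
kernel_bytes = ADDRESS_SPACE - KERNEL_START # what the kernel reserves

print(f"user:   {user_bytes // GB} GB")
print(f"kernel: {kernel_bytes // GB} GB")

# A single process that wants all 4 GB of installed RAM cannot map it.
wanted = 4 * GB
print("fits in one 32-bit process:", wanted <= user_bytes)
```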
...and read some of the papers on x86-64. AMD has a lot more than 16 registers *internally*. But it turned out the performance got WORSE when they were exposed to the compiler, instead of managed internally. If they can't even manage such a trivial change well, it's likely the RISC compilers would do worse than a CISC-RISC decoder stage.
If you want to make a computer performing anything close to modern standards, you're going to have to deal with interdependency of the RISC instructions anyway (pipelining, hyperthreading, multiple cores etc.) Don't you think Intel or AMD would provide a "native" interface if the decoder stage was really holding them back?
In short, I'm sure the engineers at AMD and Intel have picked apart x86 code and said "With perfect compilation to our internal structure, how much faster would it be?" and found that it simply isn't the way you describe it.
Kjella
Live today, because you never know what tomorrow brings
The Opterons have 1 MB (8 Mb) L2 cache where the G5 has .5 MB (4 Mb) L2.
At similar clockspeeds I think the performance is fairly similar, though the Opterons may do better in a dual-CPU configuration since they have on-chip memory controllers and thus more total memory bandwidth.
I'd like to see a head-to-head shootout using top compilers (an often overlooked issue) for both.
Galileo: "The Earth revolves around the Sun!"
Score: -1 100% Flamebait
He was addressing the compilation test specifically, which is the one he linked to and the one where the slowest Athlon does indeed beat out the fastest P4.
So, yes, I would say his post was informative.
> No they don't. Ever compared a compiler outputted innerloop with a hand coded one?
Yea, they do, and yea, I have. Absolute Perfection(tm) of an instruction rarely the performance of a system makes. Compilers may not be perfect, but they're usually well beyond good enough. Good compilers have been really quite good at register allocation for quite some time.
> Which RISC needs more than that?
Those that need to accomplish the various options offered. Ever compared assembly instruction sets by using them in actual practice? I have. In many cases you use 1 RISC or 1 CISC. Often you need 1-5 RISC to 1 CISC. Rarely, if ever, do you need 2+ CISC to perform a RISC. Ever wonder why the rule of thumb for sizing a RISC system generally requires twice the memory of a CISC (for similar loading)?
Perhaps the CPU will, ultimately, be left accessing that memory? Would that imply a possible rough doubling (on average) of CPU-Memory bandwidth?
Yea, it does.
> I'm sure Intel knows very well that they could perform better if they broke x86 compatibility.
Why would they know this? (Seeing how it's just not true.) There have been, and are now, lots of processors - and exactly NONE have flat out outcompeted the x86. We have fractional advantages here, or there, but NOTHING out there that HAS "broke x86 compatibility" has screamingly outperformed it.
Linux supports LOTS of CPUs. And, x86 still pretty much sits at the top of the performance heap.