Xeons, Opterons Compared in Power Efficiency

Way to put the conclusion in the article summary by 3dWarlord · 2006-12-15 02:59 · Score: 0, Troll

I know this is slashdot, but maybe I wanted to RTFA?

AMD needs to get back in the game, quick by Salvance · 2006-12-15 02:59 · Score: 4, Insightful

AMD needs to deliver some real quad-core (or 8-core) chips that will beat Intel's performance. If they don't do so soon, AMD will quickly get kicked back to being the second-rate Intel cloner everyone knew them as before their groundbreaking AMD64 and dual-core chips briefly took the performance lead from Intel. I'm keeping my fingers crossed that AMD will deliver; I've always liked (and bought) their chips as long as the performance is similar to Intel's.

--
Crack - Free with every butt and set of boobs

Re:AMD needs to get back in the game, quick by Anonymous Coward · 2006-12-15 04:48 · Score: 0

The only reason this happened at all was because of that POS known as Prescott and all the chips based on it. Intel flat out dropped the ball with that stinker. Otherwise Intel would have stayed on top. Especially since it has long been easier to get better motherboards for Intel CPUs rather than AMD (system chipsets really; USB, drive chipsets, etc.). Motherboards that support AMD CPUs tend to be pretty crappy if you ask me (I'm typing this on a nice Opteron CPU'd but piece of shit DK8N motherboard).
Re:AMD needs to get back in the game, quick by aminorex · 2006-12-15 05:13 · Score: 2, Insightful

Evidently you didn't read the review. Intel has serious problems for large scale computing. It does not scale up. It's fine as a thread engine for processing small transactions, but for the kind of problems that people like Google and NCAR are doing -- and it is people like that who drive some very large CPU buys -- the external MMU bites their ass every time. Is the current generation of Opterons a gamer buy? No. AMD probably won't dominate the gamer market until a high-end GPU is integrated on die at 45 nm. Meanwhile, it will eat up Intel's server share as the roadmap materializes for quad core. People who buy systems with upgradability in mind are the only real market right now.

--
-I like my women like I like my tea: green-
Re:AMD needs to get back in the game, quick by Anonymous Coward · 2006-12-15 05:57 · Score: 1, Interesting

What I don't get is why people are constantly comparing Intel's quad-core with AMD's revision F hardware. Is this a marketing ploy from Intel to try and make it seem like AMD isn't in the game anymore? Revision F hardware from AMD means that the hardware is *capable* of taking the future quad core chips, but AMD has not released them yet. In all fairness, what has Intel done for the public without AMD (or Sun for that matter) coming out with their own multi-core competitive product that saved us so much power? Let's be real here - if AMD and Sun didn't come out with multi-core, would we really be where we are today, or would we be cooking our pizzas over the already-overclocked and overheated Intel chips that they were trying to squeeze so much out of? Why did it take competitive vendors to make Intel really re-think their methodology? Because they wanted to make more $$ off an old technology. I am not in the least bit impressed with Intel. Of course they released a quad-core to market faster than AMD - they've been doing it for years. But again, all they did was take an idea which was not their own and go to market faster. What they don't get is AMD is working up another idea that will blow away the quad-cores. But of course, once Intel sees what they're doing and figures it out, they'll just be faster to market again... Let's give it some time and give AMD and Sun the credit they deserve for changing the future of the processor market today.
Re:AMD needs to get back in the game, quick by timeOday · 2006-12-15 08:09 · Score: 1

Which benchmark are you talking about? It looks to me like the 8 core Xeon slaughtered everything. It even won for least power to complete a job, while doing it in half the time. The Opteron only gained the upper hand in the "power at idle" test.
Re:AMD needs to get back in the game, quick by afidel · 2006-12-15 12:07 · Score: 1

Power at idle is VERY important for many datacenters because, unless you are running VMware ESX with a very tight farm, the majority of your servers probably spend a fair percentage of their day idle or nearly so. Personally, I am using dual-core Opterons for many of my n-tier boxes, with Xeons being used for the Citrix farm; each is the platform best suited for the type of work it is doing and the typical power profile. I think IT that simply plops one solution in for all needs is not taking best advantage of the available technology. My environment has grown over 50% in the last 6 months, yet our power usage has only grown 30%. That's smart use of the available technology =)

--
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.

AMD's path by homey+of+my+owney · 2006-12-15 03:01 · Score: 4, Insightful

AMD needs to do what they have been doing - thinking independently and coming up with original solutions.

Hmm, so which better reflects real-world usage? by pla · 2006-12-15 03:05 · Score: 4, Interesting

the Opteron ended up drawing much less power at idle than the Xeons
...
the Xeon 5160 used the least energy in completing our multithreaded MyriMatch search, in part because it completed the task so quickly.

So what does this mean for people shopping for servers?

If your servers constantly tick along at nearly 100% CPU use, you might do better going with the Xeon system. If your machines basically sit idle most of the time with an occasional spike for a few seconds when it actually does something, the AMD would save you more on electricity.

Of course, this raises a third possibility - Would running a number of virtual servers on one large Xeon machine waste more energy than it saves, or give a net gain?
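
One rough way to frame the idle-versus-busy trade-off above is to add up a day's energy as busy time at load power plus the remaining time at idle power. The sketch below is illustrative only: `daily_energy_wh` is a hypothetical helper, and every wattage and task time in it is a made-up placeholder, not a measurement from the article.

```python
SECONDS_PER_DAY = 24 * 3600


def daily_energy_wh(idle_w, load_w, task_seconds, tasks_per_day):
    """Watt-hours used in 24 h: busy time at load_w, the rest at idle_w."""
    busy_s = task_seconds * tasks_per_day
    if busy_s > SECONDS_PER_DAY:
        raise ValueError("workload does not fit into one day")
    idle_s = SECONDS_PER_DAY - busy_s
    return (busy_s * load_w + idle_s * idle_w) / 3600.0


# Hypothetical machines, NOT figures from the article:
# one draws more power but finishes each task in half the time.
FAST_BOX = {"idle_w": 230, "load_w": 380, "task_seconds": 60}
FRUGAL_BOX = {"idle_w": 170, "load_w": 330, "task_seconds": 120}

if __name__ == "__main__":
    for tasks in (10, 700):
        fast = daily_energy_wh(tasks_per_day=tasks, **FAST_BOX)
        frugal = daily_energy_wh(tasks_per_day=tasks, **FRUGAL_BOX)
        winner = "fast box" if fast < frugal else "frugal box"
        print(f"{tasks} tasks/day: fast={fast:.0f} Wh, "
              f"frugal={frugal:.0f} Wh -> {winner}")
```

With these placeholder numbers the low-idle box wins on a mostly idle day and the faster box wins on a busy one; where the crossover falls depends entirely on the real wattages and task times.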

Re:Hmm, so which better reflects real-world usage? by archen · 2006-12-15 03:17 · Score: 3, Insightful

Although some people will pipe in with their number crunching server stories, are there any normal usage servers that really come in at 100% CPU usage? Of the 20-odd servers I run, few ever run at that rate for more than 30 minutes a day or so - and usually doing backups for that matter. Other system components often keep you from reaching that target, and most 24-7 servers I've seen do most of their work during a certain period then spend the rest of their time twiddling their thumbs.
Re:Hmm, so which better reflects real-world usage? by Anonymous Coward · 2006-12-15 03:51 · Score: 2, Insightful

Any server running at that rate for more than a few short peaks a day is under capacity. Ideally, you'd like to keep them at 100% but you don't control scheduling of server demand. It's too ad-hoc. You trend then build enough excess capacity to handle projected peak loads. Of course, this depends on the level of service you want to deliver. Most server "customers" expect the server to be always as responsive as it can be, regardless of load. (expectation of IT is always 100% all the time). So server farms or clusters are built to handle the peak, which typically happens only for short durations. Being able to use machines at full capacity and still maintain enough service overhead would require excess machines that can be brought up quickly.

What if you had mixed processors with clustering software savvy enough to push jobs onto idling machines? You keep the Xeons humming along at nearly 100% and you push peak loads onto a bunch of idling Opterons. Could Xen be made to do that? Have the cluster optimize for best power efficiency at whatever load. On the scale of an enterprise, the cost savings in power over time would be significant.
Re:Hmm, so which better reflects real-world usage? by ratboy666 · 2006-12-15 04:18 · Score: 1

4 core opteron x2 vs. 8 core xeon x1

From the article, the idle power consumption of the 8 core xeon is ~230W. 4 core opteron is ~120W.

Which means, at idle, the single 8 way xeon is better than 2 4 way opterons. Given that the efficiency of the 8 way under load is better than the 4-way, I would think that stacking on the 8-way is better.

Of course, having two 4 way independent systems is better redundancy. On the other hand, the 8 way can be utilized to solve SMP multithread problems (without the expense of high speed interconnects).

Since redundancy is so very important, I would recommend two 4 way systems (opteron) if that meets your needs. If you need more capacity, the next step (sweet spot) would be two 8 way systems (xeon).

But it is a pretty big jump.

Ratboy

--
Just another "Cubible(sic) Joe" 2 17 3061
Re:Hmm, so which better reflects real-world usage? by thisissilly · 2006-12-15 04:32 · Score: 1

are there any normal usage servers that really come in at 100% CPU usage?
The anti-spam filters at my place of employment (two machines, each with a single 2.6GHz Xeon). That's why we are replacing them with two machines, each using two dual-core Xeons, for 4x the CPU power.
Re:Hmm, so which better reflects real-world usage? by ptbarnett · 2006-12-15 04:36 · Score: 2, Interesting

Although some people will pipe in with their number crunching server stories, are there any normal usage servers that really come in at 100% CPU usage?
For capacity planning purposes, most of my clients target 40-50% CPU utilization on servers. If it starts creeping above 60% on a consistent basis (or is forecasted to do so soon), they begin the acquisition process to either upgrade or add servers.
Queuing theory (M/M/1) shows that while the average response time doesn't increase that much, the standard deviation increases rapidly as utilization grows above 60%. Restated in simpler terms: a larger proportion of response times become significantly larger -- to the point that users start to notice and either complain or go elsewhere.
Other system components often keep you from reaching that target,
Yes, system overhead starts to increase rapidly on most systems as you approach 100% CPU utilization. In many cases, total throughput actually decreases above system utilization of about 85-90%.
most 24-7 servers I've seen do most of their work during a certain period then spend the rest of their time twiddling their thumbs.
I've looked at usage patterns for a number of systems. Whether they are public (online banking) or internal-use-only, they all seem to have the same pattern: usage peaks about 10:00 AM, with a smaller secondary peak about 1:30 PM. The second peak usually disappears on Friday afternoon.
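
For anyone who wants to play with the M/M/1 numbers mentioned above, here is a minimal sketch. It relies on two textbook results for a first-come-first-served M/M/1 queue: mean response time is S / (1 - rho), and response time is exponentially distributed, so a tail percentile is a fixed multiple of the mean. The function name `mm1_response` and the 100 ms service time are assumptions made for illustration; real servers are rarely a clean M/M/1 system.

```python
import math


def mm1_response(service_time_s, utilization, percentile=0.95):
    """Return (mean, percentile) response time in seconds for an M/M/1 queue."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    if not 0.0 < percentile < 1.0:
        raise ValueError("percentile must be in (0, 1)")
    # Mean time in system: S / (1 - rho).
    mean = service_time_s / (1.0 - utilization)
    # Response time is exponential with that mean, so the p-th percentile
    # is -ln(1 - p) times the mean.
    tail = -math.log(1.0 - percentile) * mean
    return mean, tail


if __name__ == "__main__":
    service_time = 0.100  # hypothetical 100 ms of work per request
    for rho in (0.4, 0.5, 0.6, 0.8, 0.9):
        mean, p95 = mm1_response(service_time, rho)
        print(f"utilization {rho:.0%}: mean {mean * 1000:.0f} ms, "
              f"95th percentile {p95 * 1000:.0f} ms")
```

Under this model a 100 ms request averages 200 ms at 50% utilization and a full second at 90%, and the slowest requests stretch by the same factor, which is why the tail becomes noticeable to users well before the box looks "full".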
Re:Hmm, so which better reflects real-world usage? by twiddlingbits · 2006-12-15 04:42 · Score: 2, Interesting

If I'm doing general-purpose computing I would trade the 10W difference in power consumption for the redundancy and flexibility of the 4-way Opteron. With two 4-way boxes you can use one as the failover for the other, or load balance between them, keeping low CPU use on each. General-purpose computing really doesn't need the power of an 8-way SMP solution even with 1000's of users. You can virtualize either the 4-way or the 8-way with VMware or Xen or Solaris Containers, so that (IMHO) is a wash.

It's really back to the old Horizontal vs Vertical scaling argument, which involves a lot of factors along with power consumption. If floor space in your data center is at a premium you probably want the 8-ways, as you can double your server density per rack (assuming you have the power and cooling). If your servers idle most of the time, space is not an issue, and you are at close margins on data center power and cooling, the Opteron 4-way might be a better choice. There are also cost differences to consider. Opterons are usually priced below Xeons, so if bottom-line hardware costs are important that pushes you to Opterons. You also have to look at the number of HBAs and network connections a 4-way and an 8-way will support. There are SO many combinations to consider, including how much IT growth will occur, that it is mind boggling! It all depends on the strategic and tactical decisions made by the Data Center Team and the IT Organization; some places are all about performance, some are all about cost, and others try to get a knife-edge balance. Also keep in mind that what you buy today is probably obsolete in 18 months and likely will be replaced in 36-48 months.

There is also a 3rd Option. If you don't mind running on Sun SPARC equipment then the SPARC T1 based servers blow both options out of the water in terms of power consumption (just don't do a lot of floating point..they suck at that). If you are running Java and other products that have SPARC and Solaris 10 (Linux soon) versions then changing to a SPARC architecture might get some really big gains. However if you are a .NET shop or a Windows server shop you are stuck with the X86 Architecture with Xeon or Opteron.
Re:Hmm, so which better reflects real-world usage? by rbanffy · 2006-12-15 04:47 · Score: 4, Insightful

Well... If you have a couple servers that idle most of the time, I suggest that, instead of AMD, you buy VMWare.

Or go Xen, OpenVz or whatever does the trick.

But, most important, get rid of the idling boxes.

--
http://www.dieblinkenlights.com
Re:Hmm, so which better reflects real-world usage? by BDPrime · 2006-12-15 07:09 · Score: 1

It seems unfortunate that The Tech Report is the one that has to step up and measure energy efficiency. OK, so AMD is more efficient at idle and Xeon is more efficient at 100%. Who ever really runs at either of those levels? What about 10%, 20%, 30%, etc.? Those are real-life utilization rates. SPEC is looking into doing something. So is the EPA. Maybe they can get together and figure it out.
Re:Hmm, so which better reflects real-world usage? by Amouth · 2006-12-15 08:50 · Score: 1

i am not sure what it would be for this but i do know that i am currently running 5 virtual servers on a dual p4 2.4 xeon box - load is around 25-40 % average with up to 80 when it gets hammered.. but no service issues - the only bottleneck is disk i/o - we are currently adding/upgrading raid controllers on it to give the virtual servers better disk access but over all it works well - i was able to pull 3 servers off the rack and virtualize them..

the dual xeon consumes ~280 watts constant and each of the 3 that were pulled used ~180 so yea we save power.. but it will take more than a life time to make up for the hardware cost - but that isn't why we did it.. it was mainly for the reliability and portability of virtual servers.. that and wanting to use everything to its fullest extent

--
'...if only "Jumping to a Conclusion" was an event in the Olympics.'

This just in! by gentimjs · 2006-12-15 03:10 · Score: 4, Insightful

Apples compared to Oranges: Our findings on the page after the banner ads!
.. nothing to see here, move along...

Re:This just in! by mako1138 · 2006-12-15 13:16 · Score: 1

Sure there's banner ads, but do you go to hardware sites much? The Tech Report is one of the last honest places on the web, IMO.

Conclusions converted to $$$ by ben+there... · 2006-12-15 03:14 · Score: 2, Interesting

"The eight-core Xeon 5355 system managed to render our multithreaded POV-Ray test scene using the least total energy, even though its peak power consumption was rather high, because it finished the job in about half the time that the four-way systems did. Similarly, the Xeon 5160 used the least energy in completing our multithreaded MyriMatch search, in part because it completed the task so quickly."

Presumably, the article tests power consumption because businesses are concerned with how much running each of these systems will cost them. If the Xeons managed to win in power consumption because they completed the task in half the time, that has other cost-saving benefits even beyond power consumption. They can use fewer systems to complete tasks within the deadline, complete tasks ahead of schedule (making their business slightly more agile), and/or spend less money on animators waiting for their animations to render.

Re:Conclusions converted to $$$ by FirstOne · 2006-12-15 04:01 · Score: 1

"Presumably, the article tests power consumption because businesses are concerned with how much running each of these systems will cost them. If the Xeons managed to win in power consumption because they completed the task in half the time, that has other cost-saving benefits even beyond power consumption. "

The benchmarks chosen have very little to do with the real business world.
They mostly demonstrate the effect of Intel's larger CPU caches on performance.

Choose a series of applications(processes) which accesses very large data sets (web, mail, file.. virtual servers etc) and watch the Intel CPU's begin to choke. I.E. Single process (multiple threads) benchmarks are inherently biased towards CPU's with larger unified caches.

The AMD Opteron CPU has over double the available memory bandwidth.
Which is very handy dealing with large data sets and executing real world commercial applications.
Re:Conclusions converted to $$$ by ben+there... · 2006-12-15 04:28 · Score: 1

Sounds like you're talking about server use while they tested workstation use. It looks like they called it "server/workstation" class, whatever that means.

Re:God, I'm sick of this architecture by gentimjs · 2006-12-15 03:14 · Score: 2, Funny

/me hugs his ultrasparc system
Couldnt agree more. Oh wait, something's sending an Int. Req. , cant type have; to see what it wants.....

Re:God, I'm sick of this architecture by b0s0z0ku · 2006-12-15 03:18 · Score: 1

Looks like Cell and Power are our only hope.

80x86 may be ugly, but it's cheap for the processing power and has an entrenched economy of scale. It sucks. Even Apple switched from PowerPC and is now making glorified Wintel clone boxes (though with a pretty nifty feature set).

-b.

Re:Way to put the conclusion in the article summar by Anonymous Coward · 2006-12-15 03:21 · Score: 0, Offtopic

Are you seriously claiming "spoiler" on a tech article? That's a new level of silliness.

Re:God, I'm sick of this architecture by ben+there... · 2006-12-15 03:23 · Score: 2, Interesting

Aren't newer x86 processors essentially CISC that convert the instructions down to RISC? And RISC processors, like G4/G5, that use instruction sets such as Altivec are actually using some aspects of CISC?

That was my understanding, after reading articles like this one on Ars Technica. If true, it would make fighting over CISC vs. RISC not make a lot of sense.

sophisticated approaches are required by ZahnRosen · 2006-12-15 03:25 · Score: 0, Troll

I love the thinking in this report, look at total energy consumption for a given render... Brilliant... TCO FTW!

Re:God, I'm sick of this architecture by operagost · 2006-12-15 03:33 · Score: 1, Funny

bizzaro CISC instruction set

1994 called, they want their architecture debate back.

--

Gamingmuseum.com: Give your 3D accelerator a rest.

Re:God, I'm sick of this architecture by gentimjs · 2006-12-15 03:34 · Score: 1

It's not just risc vs cisc ... the whole x86 system is based around resource-fumbling bus subsystems. When you get down to it, the whole motto of x86 really could be "get in line, and wait" .. it's 1970s era crap.

The fact that the CPU now runs at 324236GHz and can chew the math nice and fast doesnt alter the fact that the -rest- of the system (A20 gateway stuck on the KB controller and such.. ahem..) deserves to go the way of Wang...
I've always been a fan of systems like MIPS and Ultrasparc: Engineered right the first time... We'll see if cell makes it into a non-wintel-clone-type subsystem

Re:God, I'm sick of this architecture by Anonymous Coward · 2006-12-15 03:36 · Score: 0

I've heard this repeatedly on Slashdot, and it never makes sense. The idea of RISC is that the responsibility for interpretation and optimization falls on the compiler. Sure, adding an extra interpretation layer on the processor will simplify the internals, but that layer is still necessary. If that interpretation was done at compile time instead of runtime, the situation would be much simpler for the processor. And *that* is RISC.

Re:God, I'm sick of this architecture by Ancil · 2006-12-15 03:36 · Score: 5, Informative

bizzaro CISC instruction set piece of shite

I guess you didn't get the memo. Turns out RISC wasn't the good idea everyone thought it would be in the 1990's.

RISC worked well when speed of memory and CPU's were at parity. The simplified instructions let the CPU be clocked a lot faster, not to mention their shallow pipelines made it less costly when branch prediction failed. The tradeoff was that it usually took more instructions to accomplish a given task.

But as CPU's have spent more and more time waiting for memory, CISC has really come into its own. Think of CISC as a compression algorithm: An x86 instruction which fits in 16-32 bits might take 4 or 5 instructions on a RISC processor, weighing in at 96-128 bits. It's no surprise why CISC processors have destroyed RISC in the past decade.

oracle datacenter by chap_hyd · 2006-12-15 03:39 · Score: 4, Informative

One friend who works for Oracle, in their datacenter, told me that they are swapping the Dell Intel Xeon servers for Sun AMD Opteron servers. The main reason behind this server swap is the power efficiency of the new Sun servers. So that means big corps already have their eye on AMD CPUs :)

Re:oracle datacenter by speculatrix · 2006-12-15 04:53 · Score: 1
it doesn't make any sense to swap out a working and functional server running intel chips with one running AMD purely for power saving, because electricity is a relatively small part of the lifetime cost of a server, until
- the server no longer has adequate spare capacity and would be upgraded
- you're beginning to overload your power or cooling grid, and it's cheaper to regrade your servers (which can be deployed elsewhere) than change the power grid or fix your air-con
it's a similar problem for car users - for an average vehicle doing 25mpg, about half the energy of its lifetime of making, using, and recycling/scrap is consumed when making.. environmentally it's better to fix up an old car so it runs properly with minimal emissions than generate a lot of scrap metal & plastics and incur the environmental costs of mining/refining metals, drilling for oil for plastics, manufacture etc of a new car.
Re:oracle datacenter by Tmack · 2006-12-15 07:03 · Score: 1
> it doesn't make any sense to swap out a working and functional server running intel chips with one running AMD purely for power saving, because electricity is a relatively small part of the lifetime cost of a server, until
> - the server no longer has adequate spare capacity and would be upgraded
> - you're beginning to overload your power or cooling grid, and it's cheaper to regrade your servers (which can be deployed elsewhere) than change the power grid or fix your air-con
> it's a similar problem for car users - for an average vehicle doing 25mpg, about half the energy of its lifetime of making, using, and recycling/scrap is consumed when making.. environmentally it's better to fix up an old car so it runs properly with minimal emissions than generate a lot of scrap metal & plastics and incur the environmental costs of mining/refining metals, drilling for oil for plastics, manufacture etc of a new car.

Considering that Xeons have been around for years now, for all the parent stated these could be old 1 GHz or slower Xeon-based servers. Rather than upgrading to the latest, they decided to switch platforms, which would meet your criteria.
However, I disagree with your statement that the cost to power a server is a small fraction of its cost. A basic server, costing about $4k (nothing fancy), running 24x7x365.25 at about 300Watts, will use 18408.6 KWH in one year. At $0.07/KWH, that's $1288.60 per year just to power the box. Data center design estimates usually state the power overhead for a server is about the same amount it actually consumes, so that raises the cost per year to $2577.20, more than half its original hardware cost. Most servers I manage are in the rack for well over 2 years, so stating that it's a small portion of the lifetime cost is invalid. This does not include items such as maintenance contracts, software licenses and other similar costs because those don't really change between different platforms. Even Google recognizes this, and it's the whole reason behind both AMD's and Sun's newer processor lines:
Link
blah
tm
--
Support TBI Research: http://www.raisinhope.org
Re:oracle datacenter by aczisny · 2006-12-15 09:08 · Score: 2, Informative

A basic server, costing about $4k (nothing fancy), running 24x7x365.25 at about 300Watts, will use 18408.6 KWH in one year. At $0.07/KWH, that's $1288.60 per year just to power the box.

It took me forever to figure out what was wrong with this. I knew your numbers didn't add up but I couldn't put my finger on it until I realized you multiplied out exactly what people say when they mean constant uptime. The problem, of course, is that it should be 300(watts)*24(hours/day)x365(days/year) or 300(watts)*24(hours/day)x7(days/week)x52(weeks/year) to get the power used in a year. You end up with 2628 KWH a year. At $0.07/KWH you get $183.96, which is much more reasonable. Not something I'd ignore as a business with hundreds of machines, but not a quarter of the cost of the machine itself either.

As my chemistry teacher always used to tell me, UNITS! It's all about keeping proper track of your units!

--
Now, landing thrusters.. landing thrusters, hmm. Now if I were a landing thruster, which one of these would I be?
Re:oracle datacenter by afidel · 2006-12-15 12:38 · Score: 1

The problem is that just considering power is stupid. I figure power used x3 when designing because between inefficiencies, heat load from UPSs, AC, etc. that's about what you end up at. So 365 days * 24 Hrs * 300W / 1,000 (WHrs/KWHr) = 2628 KWhrs * 3 = 7884 * $.12/KWhr (realistic for most of the country when you include delivery charges) = $946.08/year. Then add in the amortized cost per KW of your UPS and generator and it almost doubles that figure, so say $2K/year. Over the useful lifetime of the typical server it costs about as much to run as it does to purchase. Add in IT time and it costs about triple the purchase price without considering software licenses! TCO can be hard to figure out in many organizations but this is a good rough guideline for back of the envelope calculations.
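
The arithmetic argued over in this sub-thread is easy to check with a few lines of code. The sketch below only reuses numbers quoted in the comments (300 W, $0.07 and $0.12 per kWh, a x3 facility overhead); the helper names `annual_kwh` and `annual_cost` are made up for illustration, and the overhead factor is a rule of thumb from the post above, not a measured value.

```python
HOURS_PER_YEAR = 24 * 365  # 8760; note this is NOT 24 * 7 * 365.25


def annual_kwh(watts, hours=HOURS_PER_YEAR):
    """Energy used in a year by a constant load, in kilowatt-hours."""
    return watts * hours / 1000.0


def annual_cost(watts, price_per_kwh, overhead=1.0):
    """Yearly electricity cost; overhead scales for cooling, UPS losses, etc."""
    return annual_kwh(watts) * overhead * price_per_kwh


if __name__ == "__main__":
    # Straight consumption of a 300 W box at $0.07/kWh.
    print(f"{annual_kwh(300):.0f} kWh/year, ${annual_cost(300, 0.07):.2f}/year")
    # Same box with the x3 overhead rule of thumb at $0.12/kWh.
    print(f"${annual_cost(300, 0.12, overhead=3):.2f}/year with overhead")
    # Multiplying 24 x 7 x 365.25 treats every day as a week of hours
    # and inflates the result sevenfold.
    print(f"{300 * 24 * 7 * 365.25 / 1000:.1f} kWh/year (units mistake)")
```

Keeping the hours-per-year constant in one named place is the code equivalent of the "watch your units" advice: the 24x7x365.25 slip becomes hard to make by accident.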

--
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.

Re:God, I'm sick of this architecture by $RANDOMLUSER · 2006-12-15 03:42 · Score: 1

You are correct.
It does:

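push BP        ; save the caller's frame pointer
mov BP,SP      ; start a new stack frame
sub SP, 10     ; reserve 10 bytes for locals

and

mov SP,BP      ; release the locals
pop BP         ; restore the caller's frame pointer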

internally very quickly as RISC instructions. It's still 5 bus cycles.

--
No folly is more costly than the folly of intolerant idealism. - Winston Churchill

Best Practices by killmenow · 2006-12-15 03:47 · Score: 5, Insightful

It has always been my understanding that best practices dictate a server running at a constant 100% CPU utilization is underpowered and needs upgraded. Normal, every day, steady CPU utilization should hover no higher than around 50% (closer to 75%, if you like living on the edge) leaving enough CPU to handle peak loads. Very few functions require a system that maintains a constant CPU utilization and never peaks over it.

Re:Best Practices by greg1104 · 2006-12-15 07:51 · Score: 1

> Normal, every day, steady CPU utilization should hover no higher than around 50% (closer to 75%, if you like living on the edge) leaving enough CPU to handle peak loads

A server that's providing services to regular users, sure. But if your server is doing computational work, like many of the scientific computing examples given in the article, it should be spending every minute of every day at 100% utilization.
Re:Best Practices by Anthony · 2006-12-16 08:07 · Score: 1

And here is an example. Sunday is a quiet day as job submission drops off, but this cluster is normally near 100% CPU capacity [APAC]. Each bar is a 32 Itanium2 CPU node.

--
Slashdot: Where nerds gather to pool their ignorance

Geeks in the future... by design.sound · 2006-12-15 03:48 · Score: 0, Offtopic

"I got a 2KW Optitron running GoogOS, you?"

"3KW Sexium on Microsoft Linux."

"Shut up and roll."

Meaningful, normalized values of watt/performance? by Phatmanotoo · 2006-12-15 03:48 · Score: 1

It's very useful to have some normalized way of measuring watts/performance, as they try to do in this article. But at least they could have used a more general and useful benchmark, like those offered by www.spec.org.

Re:God, I'm sick of this architecture by vtcodger · 2006-12-15 03:48 · Score: 1

***It's almost 2007 and we're still hanging bags on the side of the 8080. No matter how many cores, caches or pipelines, no matter the clock rate, it's still the same-old same-old single-accumulator, bizzaro CISC instruction set piece of shite.***

Please don't get the idea that I'm defending the Intel x86 instruction set. When I first saw it in the early 1980s, I thought it was the most gawdawful mess I'd seen in 25 years in the business (I wrote my first assembler code in 1960). It hasn't improved any with time. I still detest it. Thankfully, I rarely have to use it.

[My candidate for the best microcomputer instruction set from the programmer's POV -- hands down, the MC6809]

But my understanding is that (almost?) all modern CPUs in fact have some different -- often vastly different -- architecture under the hood of their x86 chips and just use the x86 set as a sort of pidgin language that they translate into real instructions that typically run in a multiple register, multiple stack, highly parallel, etc environment of some sort. Do I have that wrong?

Is it time and past time to devise a new pidgin language? Probably. But let's don't let Intel have too much influence on the process. Intel doesn't seem to know the meaning of terms like simple or straightforward. I've never encountered anything they did that wasn't overly complex and often their design decisions seem to me to be utterly baffling.

In any case, it is still necessary to emulate the x86 because there is all that legacy code out there. That doesn't mean that the hardware itself is constrained to actually implement all the x86 weirdness -- it just has to be able to act like it does.

--
You can't see ANYTHING from a car, You've got to get out of the goddamned contraption and walk...Edward Abbey

Re:God, I'm sick of this architecture by $RANDOMLUSER · 2006-12-15 03:49 · Score: 2, Insightful

What I'm really referring to here is the extreme non-orthogonality of the ISA and the register set. I'm certainly not a purist when it comes to what individual instructions are allowed to do, but there's a lot to be said for having instructions all be the same width.

--
No folly is more costly than the folly of intolerant idealism. - Winston Churchill

Info Power by Doc+Ruby · 2006-12-15 03:53 · Score: 1

I'd like to see these efficiency curves plotted against 100%, the maximum theoretical efficiency of the transfer function through the semiconductors. Anyone know how to calculate the minimum W:b (watts per bit) necessary for these real-world tasks? Or is that just way too complex a stat to compute without melting the datacenter at which it's computed?

--
make install -not war

Re:Info Power by Anonymous Coward · 2006-12-15 09:16 · Score: 0

Look into reversible computing. Essentially, erasing information takes energy, because it lowers the entropy of the system. So in theory, memory chips would heat up while CPUs stay cold, but no current chip comes anywhere close to that limit yet.

Quote from http://www.zyvex.com/nanotech/reversible.html

When a computational system erases a bit of information, it must dissipate ln 2 x kT energy, where k is Boltzmann's constant and T is the temperature. For T = 300 Kelvins (room temperature), this is about 2.9 x 10^-21 joules. This is roughly the kinetic energy of a single air molecule at room temperature.

Today's computers erase a bit of information (in the sense used here) every time they perform a logic operation. These logic operations are therefore called "irreversible." This erasure is done very inefficiently, and much more than kT is dissipated for each bit erased.

Another factor is leakage current, because really tiny transistors use up a lot of energy in the "off" state, as well as the "on" state.
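
For anyone who wants to check the quoted figure, here is a minimal sketch of the k*T*ln 2 calculation. The function names and the 1e18 erasures-per-second workload are made up for illustration; only the formula and the 300 K temperature come from the quote above.

```python
import math

# Boltzmann's constant in joules per kelvin (exact SI value).
BOLTZMANN_J_PER_K = 1.380649e-23


def landauer_limit_joules(temperature_kelvin: float) -> float:
    """Minimum energy dissipated when one bit is erased: k * T * ln 2."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


def minimum_watts(bits_erased_per_second: float,
                  temperature_kelvin: float = 300.0) -> float:
    """Theoretical floor on power for a given erasure rate (hypothetical workload)."""
    return bits_erased_per_second * landauer_limit_joules(temperature_kelvin)


if __name__ == "__main__":
    print(f"{landauer_limit_joules(300.0):.2e} J per erased bit at 300 K")
    # 1e18 erasures per second is an arbitrary illustrative rate, not a measurement.
    print(f"{minimum_watts(1e18):.2e} W floor for 1e18 erasures per second")
```

At 300 K this lands on roughly 2.9 x 10^-21 joules per bit, matching the quote. Real chips dissipate far more than this floor per logic operation, which is the point the quote makes.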

Re:Way to put the conclusion in the article summar by Doc+Ruby · 2006-12-15 03:56 · Score: 1

I know this is Slashdot, but what's stopping you from R'ingTFA? The suspense lost by the spoiler?

--
make install -not war

Re:God, I'm sick of this architecture by aminorex · 2006-12-15 03:59 · Score: 1

You've accepted a fallacy of false dichotomy. While the 90s posed a dilemma of RISC vs. CISC, modern hardware architectures are more akin to VLIW. The ISA may be a stack machine, much to the dismay of compiler writers everywhere, but that is flattened into a superscalar VLIW microcode stream.

--
-I like my women like I like my tea: green-

Re:God, I'm sick of this architecture by diegocgteleline.es · 2006-12-15 04:04 · Score: 1

It looks to me that the Instruction Set War (CISC vs RISC) is pretty much lost. Nobody cares about the instruction set. Microsoft is not the culprit. CISC processors just got fast, much faster than many RISC processors. These days what makes a CPU fast is what there's inside, not the instruction set.

Re:God, I'm sick of this architecture by multimediavt · 2006-12-15 04:09 · Score: 1

Actually, don't rule out "something completely different" from Intel now that Apple is a partner. Intel has been trying for more than a decade to break out of the boring beige box business that Microsoft dragged them into. Sure, it's been VERY profitable up to this point, but there's a curve in the road and something must be done. I strongly believe that Intel and Apple will come up with a hardware solution that will clearly differentiate the Mac from other Intel-based products. Don't know when this might happen, but I'm buying more stock in both companies ASAP. Don't get me wrong, IBM's Power and Cell architecture are going to take some quantum leaps in the next 18 months too. The next two to five years may be very interesting for the computing world.

Re:God, I'm sick of this architecture by pz · 2006-12-15 04:12 · Score: 1

[My candidate for the best microcomputer instruction set from the programmer's POV -- hands down, the MC6809]

Amen, brother! While I haven't been coding for quite as long as you (for me, it was 1976 when I started), I've used a hefty number of instruction sets and designed a handful myself. The 6809 was always my favorite. I still have a well-worn copy of the 6800 instruction set manual in my library; so clear, so beautiful. This was back when instruction set design was based purely on merit (what is the best way to compute?) rather than market forces (what is the best way to run MS applications?).

--

Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.

Business needs to pay attention by msobkow · 2006-12-15 04:15 · Score: 4, Insightful

I know of and have worked with too many organizations that figure it's just a matter of slapping all the computers in an air-conditioned room. Every watt of waste heat adds to the A/C bill.

Old fashioned water-cooled mainframes and big iron (for its time) often recirculated the wasted heat into the heating systems of the surrounding buildings. We've known all along how to be more energy efficient, if companies and management would only place the emphasis on the environment in their budgets.

--
I do not fail; I succeed at finding out what does not work.

Re:Business needs to pay attention by ZorinLynx · 2006-12-15 07:39 · Score: 1

This doesn't work everywhere. Down here in Miami we have to run A/C the entire year, because it rarely gets cold enough outside for heating to even be needed, much less added to.

I'm surprised there aren't more data centers in places with really cold climates. Must be nice to use waste heat to heat the building, or just put a radiator with a fan blowing through it outside instead of having to use air conditioning.

-Z
Re:Business needs to pay attention by Umbrel · 2006-12-15 07:52 · Score: 1

Let's outsource our server farms to Alaska and Siberia, although IT techs will not be happy with that... I guess

--
Ave Maria

Re:God, I'm sick of this architecture by Ed+Avis · 2006-12-15 04:18 · Score: 1

An x86 instruction which fits in 16-32 bits might take 4 or 5 instructions on a RISC processor,

Do you have evidence to back that up? From the limited amount that I've seen, the opposite seems to be true - one or two instructions on an ARM or MIPS processor can neatly do what takes several instructions of fumbling on an i386. Partly this is because of more registers accessible at the instruction level, and partly because of a more orthogonal instruction set.

You could compare the size of object code spat out by gcc for different architectures, though of course gcc may not be the most efficient writer of assembly language.

--
-- Ed Avis ed@membled.com

Well too bad get used to it by Sycraft-fu · 2006-12-15 04:28 · Score: 2, Interesting

It's not going anywhere. Intel actually wanted to replace it, though it's arguable whether their replacement was better or worse, but AMD won the 64-bit round with x86-64. That's what Linux uses, that's what Windows uses, it's a done deal.

Now personally to me you sound like someone who's spent a little too much time in a computer science architecture class soaking up theories about ISAs and too little time actually looking at how chips are made these days and what works. When you get right down to it, x86 works just fine. The chips built on it are very fast, the compilers are able to generate efficient code for it, it plain works in the real world. You may not like it, but it does work well in the real world.

Will something like the Cell kill it? Maybe, but forgive me if I'm more than a little skeptical. There's been things that are going to kill x86 for a long time and none of it has panned out. You can try and make your ISA as brilliant as you like, what it really seems to get down to is good chip design for the money, and Intel and AMD are hard to beat at that.

Power = Heat by mungtor · 2006-12-15 04:37 · Score: 2, Insightful

"If your machines basically sit idle most of the time with an occasional spike for a few seconds when it actually does something, the AMD would save you more on electricity."

More importantly, I think, is that power consumption translates to heat output. If you have mostly idle servers with occasional spikes, you can either cool them for less or put more in the same space depending on what you need. And don't forget that you actually save money twice with the AMD since you have to pay to power and cool the Xeons.

Virtualization, if done correctly, should save you more money on hardware than anything else. You load up a Xeon machine with 6 virtual servers and keep it humming at 70% load. Then you're probably putting out less heat than 5 lightly loaded AMD processors. You've saved the money on the extra hardware, and gained a lot of good things about machine portability in the future.
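
The "you pay twice" point can be put into a back-of-the-envelope formula. This is only a sketch: the wattage, electricity price, and cooling overhead passed in are placeholders you would have to measure yourself, not numbers from the article.

```python
HOURS_PER_YEAR = 24 * 365


def annual_power_cost(server_watts: float,
                      price_per_kwh: float,
                      cooling_overhead: float) -> float:
    """Yearly electricity cost for one box, including the cost of removing its heat.

    cooling_overhead is the extra watts of cooling spent per watt of server
    load (an assumed figure; 0.0 means cooling is free, 1.0 means every
    server watt costs another watt of A/C).
    """
    total_watts = server_watts * (1.0 + cooling_overhead)
    kilowatt_hours = total_watts * HOURS_PER_YEAR / 1000.0
    return kilowatt_hours * price_per_kwh


if __name__ == "__main__":
    # Placeholder inputs: a 300 W box, $0.10 per kWh, 0.5 W of cooling per watt.
    print(f"${annual_power_cost(300.0, 0.10, 0.5):.2f} per year")
```

Running it for one consolidated box versus several lightly loaded ones, with your own measured wattages, is the comparison the virtualization argument above depends on.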

Re:God, I'm sick of this architecture by drinkypoo · 2006-12-15 04:54 · Score: 1

How exactly is it VLIW? We're still only issuing one instruction per clock per core. The fact that it breaks down into micro-ops, some of which may be executed in parallel, still doesn't make it VLIW.

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"

Re:Way to put the conclusion in the article summar by daybot · 2006-12-15 04:59 · Score: 2, Funny

>I know this is slashdot, but maybe I wanted to RTFA?

You must be new here...

Chip sets for AMD are better by Joe+The+Dragon · 2006-12-15 05:23 · Score: 1

With the intel chip set there are only 2 x8 pci-e links coming out of the north bridge, and sas / sata-2, pci-x, networking, as well as the pci-e slots on the board have to share them.

So with a lot of network use and disk use you can choke up that bus.

Re: Chip sets for AMD are better by Joe+The+Dragon · 2006-12-16 03:59 · Score: 1

Not on the server / workstation side.
Also, there are very few intel workstation boards that can run the new xeons and have at least one full x16 pci-e slot.

Re:God, I'm sick of this architecture by Anonymous Coward · 2006-12-15 05:25 · Score: 2, Insightful

This is foolish. Variable-width instructions provide higher instruction throughput by having lower memory bandwidth requirements and consuming less cache space. You want to code your instructions so that the most-frequently used instructions are as small as possible. This has been an active area of research for tailoring ISAs to workloads, but even an ad-hoc scheme that improves those two areas in the general case is better than none at all.

This coding is more complicated than fixed-width instructions, but this complexity is less expensive than cache in power, latency, and die space. This isn't to say that x86 ISA is optimal, but it isn't bad-enough to warrant the incessant whining that people bring up every time they discuss ISAs.

Re:God, I'm sick of this architecture by Anonymous Coward · 2006-12-15 05:43 · Score: 1, Insightful

Intel is not going to run custom processors for a single company that sells less computers in a year than Dell sells in a quarter. Apple does not want to pay the higher marginal costs that would be associated with such a proprietary run of processors nor the cost of moving their software and ISVs to yet another architecture. When Intel tries to "move outside" of the "boring beige box" (immediate tell-tale that your brain is made of cottage cheese, btw) it means convincing set-top and tablet manufacturers to include their desktop processors in their devices. It doesn't mean producing lots of custom chips for lower margins.

Oh really? by Ayanami+Rei · 2006-12-15 05:46 · Score: 1

So uh, this memory-mapped IO that I'm using instead of emulated PIO, and these programmable DMA controllers, and the cascading interrupt multiplexer, and this hypercube bus with cache coherency... that all is just a figment of my imagination.

Meanwhile my Sun has OH LOOK, a crossbar, and MY GOD! this newfangled PCI bus. WHAT HATH SCIENCE DONE?

--
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON

Test idea by Joe+The+Dragon · 2006-12-15 05:48 · Score: 1

Here is one test that needs to be done: take a duel amd opteron workstation with 2 Quadro cards in sli, put in a raid 5 sas or sata setup, and do some networking at the same time. There are duel and quad amd opteron boards with nForce Professional chip sets. Some have 4 pci-e slots (x16 x8 x8 x16) with each half coming from a HTT link.

Also take a duel intel workstation and try to do the same thing; the best that you can find is x8 x8.

Using hacked sli drivers is ok.

I think that the amd system will do better as it has much better io bandwidth.

Re:Test idea by Anonymous Coward · 2006-12-15 07:30 · Score: 0

So the Opteron challenges the other Opteron to a duel. Meanwhile the Quadros are dueling. In another room the Xeons are having a duel as well.

Would that leave you with a bunch of single CPU/GPU systems?
Re:Test idea by Joe+The+Dragon · 2006-12-15 08:08 · Score: 1

No put a duel Xeon with 2 video cards and raid next to a duel amd system with the same thing. Right intel chip sets still suck and the ones being used in amd systems have more bandwidth.
Re:Test idea by Anonymous Coward · 2006-12-15 09:23 · Score: 0

Compare:
Duel
Dual

So still no review for real server utilization by Anonymous Coward · 2006-12-15 05:54 · Score: 0

And so far the conclusion is your server farm should run at 50% utilization average, make it virtual and run it on Xeons at almost 100% and keep the other 50% on idling Opterons waiting for the peaks?

Do you code in assembly? by Ayanami+Rei · 2006-12-15 06:01 · Score: 1

No, I take it.

Then why do you care?

And we "fixed" this with x86_64. The extended instruction set allows for more orthogonal expression of what you want to do with your ops w/r/t regs and memory (although not all of them are equivalent length, the more common ones are shorter, so what does it matter?)

--
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON

The "MyriMatch" benchmark shows intel is slower by Serveert · 2006-12-15 06:47 · Score: 1

http://techreport.com/reviews/2006q4/xeon-vs-opteron/index.x?pg=7

Very interesting. The benchmark uses a database and is the only one I've seen that seems to test the limits of the CPU cache with a database... and lo and behold, at 8 threads, performance degrades for the 5355 and it's actually slower than the opteron 2218.

Or it could just be that this benchmark isn't coded well - it might use a global lock frequently so as you add more threads there's more contention. In any case someone with more time than me should dig into this benchmark which might show a weakness in the core 2 architecture.

Finally, good benchmarks. Where were these guys a month ago before I ordered those 5320s, and when will those 5355s be available for the rest of us?

--
2 years and no mod points. Join reddit. Because openness is good.

Re:The "MyriMatch" benchmark shows intel is slower by greg1104 · 2006-12-15 07:59 · Score: 1

Or it could just be that this benchmark isn't coded well - it might use a global lock frequently so as you add more threads there's more contention. In any case someone with more time than me should dig into this benchmark which might show a weakness in the core 2 architecture.

Take a look at http://tweakers.net/reviews/661/7 if you want to see how the performance of the Clovertown Core 2 chips scales with a scalable database and many clients.

HOWTO: save 20W/socket when idle on Opteron or A64 by Splork · 2006-12-15 06:51 · Score: 4, Informative

See http://electricrain.com/greg/opteron-powersave.txt .

All AMD K8 (Opteron and Athlon 64) CPUs have the ability to run the clock at an extra slow speed when in HLT (idle) mode, saving a bunch more power. Many (most?) BIOSes are not smart enough to enable this. A simple setpci command will turn it on under Linux.

Find out if it's on:

setpci -d 1022:1103 87.b

If that returns 00, it's off. To turn on clock-divide-in-hlt to div by 512 mode use:

setpci -d 1022:1103 87.b=61

(see the above URL for links to the AMD documentation on the PMM7 register; other values can work).
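
If you have a rack of these to check, the query above is easy to wrap in a script. This is a rough sketch, not a tested tool: it assumes setpci prints one hex byte per matching device (one line per socket), it only reads the register, and the function names are my own.

```python
import subprocess
from typing import List

# The read-only query from the post above: byte at offset 0x87 on AMD K8 devices.
SETPCI_QUERY = ["setpci", "-d", "1022:1103", "87.b"]


def clock_divide_states(setpci_output: str) -> List[bool]:
    """Parse setpci output; per the post, a value of 00 means the feature is off."""
    return [int(line.strip(), 16) != 0
            for line in setpci_output.splitlines()
            if line.strip()]


def query_clock_divide() -> List[bool]:
    """Run setpci (needs pciutils and usually root) and report one flag per device."""
    result = subprocess.run(SETPCI_QUERY, capture_output=True, text=True, check=True)
    return clock_divide_states(result.stdout)


if __name__ == "__main__":
    for index, enabled in enumerate(query_clock_divide()):
        print(f"device {index}: clock divide in HLT {'on' if enabled else 'off'}")
```

The parsing is kept separate from the subprocess call so it can be checked on a machine without the hardware. Writing the register is deliberately left out; use the setpci command from the post for that once you have read the AMD documentation.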

80x86 has the benefit of code size by DamnStupidElf · 2006-12-15 07:01 · Score: 1

Complex instructions reduce the overall code size, reducing the need for code cache and RAM. Especially with 64 bit architectures this makes a big difference. Instead of 8 byte RISC instructions, the average instruction size is probably closer to 3 or 4 bytes (not including immediate values, which of course in 80x86 can be smaller than the machine word size). Obviously RISC chips can be designed with small instruction word sizes, and for instance a pretty good RISC instruction set could live in 32 bit words, but then there are extra alignment issues to deal with. Overall, I think the idea of having a compact instruction set wins out over the simplicity of a full RISC design. Not that there aren't things I'd change with 80x86, for instance it would be nice if the next generation of x86-64 chips would support a more RISCy 64-bit mode of execution for pure 64-bit code, allowing developers (or compilers) to make the tradeoff between code size and RISC speed advantages. x86-64 already includes 8 extra registers, so perhaps having another 16 (or 48) available only from a 64-bit RISC mode could help hasten the transition to a saner instruction set.

Watt-seconds ? by ballpoint · 2006-12-15 07:21 · Score: 1

FTFA: the amount of energy used by each system to render the scene, expressed in Watt-seconds

How can you subtract a unit of time (seconds) from a unit of power (watt) ?

Assuming multiplication was intended instead of subtraction, why use Watt.seconds instead of Joule ? Still, kudos for using SI units and not something like boe.
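
For the record, a watt-second is exactly one joule, so the conversion is trivial. A small sketch (the function names and the 250 W / 120 s render are made-up example inputs):

```python
JOULES_PER_KILOWATT_HOUR = 3_600_000  # 1000 W * 3600 s


def watt_seconds(average_watts: float, seconds: float) -> float:
    """Energy as power multiplied by time; 1 W*s is exactly 1 J."""
    return average_watts * seconds


def joules_to_kwh(joules: float) -> float:
    """Convert joules (watt-seconds) to the kilowatt-hours on a power bill."""
    return joules / JOULES_PER_KILOWATT_HOUR


if __name__ == "__main__":
    energy = watt_seconds(250.0, 120.0)  # hypothetical render: 250 W for 120 s
    print(f"{energy:.0f} J = {joules_to_kwh(energy):.5f} kWh")
```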

--
Flourescent (adj): smelling like ground wheat.

Re:Watt-seconds ? by Celandine · 2006-12-15 07:54 · Score: 1

Compare `kilowatt-hour'.

Re:God, I'm sick of this architecture by jaxom_01 · 2006-12-15 08:26 · Score: 1

CISC has dominated RISC so much that 4 out of the top 5 computers in the top500.org list are RISC? (The one that isn't RISC is an Opteron dual-core cluster.) http://www.top500.org/list/2006/11/100 I had no idea that CISC was beating RISC so badly. As a side note, the 4 RISC systems in the top 5 were made by IBM. -Aaron

--
The post made with 100% recycled electrons

Re:God, I'm sick of this architecture by Chris+Burke · 2006-12-15 08:41 · Score: 2, Insightful

It's no surprise why CISC processors have destroyed RISC in the past decade.

Sorry, but CISC, specifically x86 and children, has won simply by being the architecture for which most software was written. The dominance of CISC is a similar story to (but not the same as; I'm trying to stave off an off-topic rant) the dominance of Windows -- backward compatibility is King.

The RISC makers knew this too. Back when RISC was the hot new thing in the early 90s, they were touting that RISC would be so much faster than CISC that you could emulate/translate x86 code and run it faster than a native x86 machine. If this had come to pass, then the reason to have, and thus the dominance of, x86 would have ended.

But it never did come to pass. CISC machines, starting with the Pentium Pro, started to translate CISC instructions into RISC micro-instructions internally, and then used all the benefits that RISC machines got with the main penalty being the complicated decoders on the front-end. Intel could push the performance of their chips, in large part by leveraging the enormous profits of the lucrative desktop PC business, and thus kept rough parity with RISC machines, often being faster. Since the fundamental performance problem with CISC had been solved, and it still ran all the software, CISC won and RISC lost in the mainstream processor market.

Now of course there are performance pros and cons to both. While potentially reduced code size is the main advantage of CISC, I don't think it adds up to much. Especially since things like SSE2 instructions have gotten large anyway. The main advantage of RISC is the simpler decoders, and more registers. x86-64 gives more registers, plus with a fast l1 cache stack accesses aren't expensive, and the x86 makers learned a long time ago how to make good super-scalar x86 decoders. In the end the pluses and minuses don't add up to much, and it's more about the specific architectures of each chip. In this sense x86 has done a fine job of keeping performance high.

It's unfortunate from an aesthetic point of view, because x86 is an ugly beast, but in the end practicality won, and generally there's no practical reason to care any more.

--

The enemies of Democracy are

Re:God, I'm sick of this architecture by fitten · 2006-12-15 08:45 · Score: 1

Personally, I don't like x86 either. Luckily, I've never had to write x86 assembly even though I've worked on millions of lines of source (C, C++, etc.) So, aesthetics of the ISA are (no matter what I think) irrelevant because most of us will never see the ISA. I grew up learning the 6502, 6800, 68000, SPARC, and other ISAs. Those were nice to use and made x86 look like a Gorgon. I haven't written any assembly at all in over 20 years. My lowest level language has been C so that's my "ISA".

Second, the CISC/RISC debate died a long time ago. Mostly, RISC fell away to mean a load/store architecture while CISC meant memory+op.

Third, something much more interesting to think about in the world where everyone is so concerned about memory bandwidth is that x86 instructions are very much like compressed binaries. One read can get an instruction that translates into a number of instructions that are more Load/Store-like (RISC-like). On a "RISC" type machine, that equivalent instruction stream could have taken a number of reads (read: bus cycles) and multiple I-cache lines to hold (read: more memory). So, not only do you save memory size, you can save many clockcycles by reading a "compressed instruction" and translating it into the several equivalent load/store (RISC) instructions.

At least... #1 helps me forget about the ugliness that is the x86 ISA and #3 actually makes me like it a little.

Re:God, I'm sick of this architecture by cnettel · 2006-12-15 08:55 · Score: 1

Do you have any reference for your statement of only one instruction (as defined in the ISA) per clock per core? The microops are what's actually scheduled, and if we don't have any dependencies they will certainly be run in parallel. The instruction decoder on Intel core is rather wide, as well.

What About Efficiency as a Space Heater by darkonc · 2006-12-15 09:31 · Score: 2, Funny

Up here in The Great White North, there is a second important feature (mostly for desktop and deskside systems) -- and that's efficiency as a space heater. When these boxes are running at full bore, how many BTUs do they generate, and how many BTUs/watt do they generate? How many Xeons or K7s would it take to heat the average house?
More importantly, how does that compare to a dedicated space-heater?

--
Sometimes boldness is in fashion. Sometimes only the brave will be bold.

Re:What About Efficiency as a Space Heater by Cassini2 · 2006-12-15 15:05 · Score: 2, Insightful

Computers are almost 100% efficient as space heaters. Almost every watt consumed gets converted to heat.

The energy in the light radiated from the monitor or from the LEDs in the computer case is very small compared to the energy consumed by the computer. Computers do no useful physical work. The result is that almost all energy consumed by a computer is converted to heat.
Re:What About Efficiency as a Space Heater by frieko · 2006-12-15 16:36 · Score: 2, Informative

That light you mention ends up as heat too.
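
Since essentially every watt ends up as heat, the BTU question above is just a unit conversion: one watt is about 3.412 BTU per hour. A quick sketch; the house heating load and per-machine wattage below are invented numbers, so plug in your own.

```python
import math

BTU_PER_HOUR_PER_WATT = 3.412142  # standard conversion factor


def btu_per_hour(watts: float) -> float:
    """Heat output of a device that turns all of its electrical input into heat."""
    return watts * BTU_PER_HOUR_PER_WATT


def machines_needed(house_btu_per_hour: float, watts_per_machine: float) -> int:
    """How many boxes at full load it takes to cover a given heating load."""
    return math.ceil(house_btu_per_hour / btu_per_hour(watts_per_machine))


if __name__ == "__main__":
    # Hypothetical inputs: a 30,000 BTU/h heating load and 500 W per machine.
    print(f"{btu_per_hour(500.0):.0f} BTU/h per machine")
    print(f"{machines_needed(30_000.0, 500.0)} machines to heat the house")
```

A resistive space heater gets the same BTU per watt, so the computer is no worse as a heater; it just costs more to buy.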

I love My Cyrix Processor by Anonymous Coward · 2006-12-15 09:34 · Score: 0

Cyrix man just flat out is uh.ja98u&^Y)#CN(&n q dang over heating problem againa

Why would anyone care? by msobkow · 2006-12-15 10:03 · Score: 1

The only ones affected are the tape monkeys, and their jobs were replaced by robotics years ago.

Twenty years ago satellite ground stations were dropped off up north with nothing more than a big tank of diesel, a power generator, and a fault-resilient or fault-tolerant server, left alone for months at a time.

With modern high speed networks and VPN access, it's often hard to tell the difference between being at work and remote access, other than the environment. Don't forget how much sysadmin work has been offshored to India and other regions, or how many global operations have geographically distributed locations, with staff at each covering the entire globe's sysadmin functions from different time zones.

Your theoretical idea has been possible for over 10 years.

--
I do not fail; I succeed at finding out what does not work.

Re:God, I'm sick of this architecture (WE CPUs) by dltaylor · 2006-12-15 10:03 · Score: 1

The old Western Electric (A.T. & T.) CPUs (WE31000/WE32000, as in the 3B-series computers) Huffman-coded the instruction set. More-frequently used opcodes were smaller than less-frequently used opcodes, so "instructions/memory word" was denser than typical RISC. Lots of registers, very powerful instructions. The processors did not fetch "instructions", they read cache lines from memory at the next uncached address of instructions.
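
To illustrate the idea of giving frequent opcodes shorter encodings, here is a generic Huffman sketch. It is not the actual WE32000 encoding; the opcode names and frequencies are invented purely to show how code lengths fall out of usage counts.

```python
import heapq
from itertools import count
from typing import Dict


def huffman_code_lengths(frequencies: Dict[str, int]) -> Dict[str, int]:
    """Return the Huffman code length in bits for each symbol."""
    if len(frequencies) == 1:
        return {symbol: 1 for symbol in frequencies}
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(freq, next(tiebreak), {symbol: 0})
            for symbol, freq in frequencies.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        freq_a, _, lengths_a = heapq.heappop(heap)
        freq_b, _, lengths_b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**lengths_a, **lengths_b}.items()}
        heapq.heappush(heap, (freq_a + freq_b, next(tiebreak), merged))
    return heap[0][2]


if __name__ == "__main__":
    # Invented usage counts per 100 instructions, for illustration only.
    usage = {"mov": 40, "add": 25, "cmp": 15, "jmp": 12, "div": 5, "xchg": 3}
    for opcode, bits in sorted(huffman_code_lengths(usage).items(),
                               key=lambda item: item[1]):
        print(f"{opcode:5s} {bits} bits")
```

With these made-up counts the most common opcode gets a 1-bit code and the rarest get 5 bits, which is the density win the post describes.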

Re:God, I'm sick of this architecture by afidel · 2006-12-15 12:47 · Score: 1

That's funny because in the real world where things like cache hit ratio matter it's been shown that a CISC front end with a rather inexpensive decode stage and a RISC multicore execution stage is the way to go. In that way modern AMD/Intel x86 CPUs are closer to the PPC970 than the PPC970 is to a classical RISC chip. Both CISC and RISC won, the two were married and each is used where most appropriate =)

--
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.

Re:God, I'm sick of this architecture by Anonymous Coward · 2006-12-15 13:57 · Score: 0

An x86 instruction which fits in 16-32 bits might take 4 or 5 instructions on a RISC processor,

Do you have evidence to back that up?

I don't have anything to back up the grandparent post, but RISC does make the code bloated.

x86 (in 32-bit protected mode, of course) uses a 5-byte instruction to load a 32-bit immediate value into one of the general-purpose registers. Pure-RISC devices like ARM and MIPS need 8 bytes to do this: a 32-bit instruction to load the upper 16 bits and a second 32-bit instruction to load the lower 16 bits. (RISC instruction sets are designed to make all instructions the same length.)

You can see the increased size of the object code if you use GCC to cross-compile the same piece of code for x86 and for ARM [*]. This bloat makes the cache memory ineffective. It's why RISC lost the war to CISC.

[*] Yes, I know about Thumb. It helps.
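
The byte counts are easy to verify by building the encodings by hand. A sketch using `mov eax, imm32` on x86 and a `lui`/`ori` pair on MIPS; the choice of EAX, register $t0 (number 8), and the value 0x12345678 is arbitrary, and big-endian word order is assumed for the MIPS bytes.

```python
import struct


def x86_mov_eax_imm32(value: int) -> bytes:
    """Encode `mov eax, imm32`: opcode 0xB8 followed by a little-endian imm32."""
    return bytes([0xB8]) + struct.pack("<I", value & 0xFFFFFFFF)


def mips_load_imm32(value: int, reg: int = 8) -> bytes:
    """Encode `lui reg, upper` then `ori reg, reg, lower` as two 32-bit words."""
    upper = (value >> 16) & 0xFFFF
    lower = value & 0xFFFF
    lui = (0x0F << 26) | (reg << 16) | upper               # I-type, opcode 0x0F
    ori = (0x0D << 26) | (reg << 21) | (reg << 16) | lower  # I-type, opcode 0x0D
    return struct.pack(">II", lui, ori)


if __name__ == "__main__":
    x86 = x86_mov_eax_imm32(0x12345678)
    mips = mips_load_imm32(0x12345678)
    print(f"x86:  {x86.hex()} ({len(x86)} bytes)")
    print(f"MIPS: {mips.hex()} ({len(mips)} bytes)")
```

That gives 5 bytes against 8 for this one case. Whether the difference adds up across a whole program is a separate question that only measuring real object code can answer.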

Re:God, I'm sick of this architecture by Anonymous Coward · 2006-12-15 14:35 · Score: 0

bizzaro CISC instruction set piece of shite

I guess you didn't get the memo. Turns out RISC wasn't the good idea everyone thought it would be in the 1990's.

Indeed. Even Linus thinks the x86 is an ugly duckling.

Re:God, I'm sick of this architecture by Salamander · 2006-12-15 14:52 · Score: 3, Interesting

You're forgetting the basic formula from Hennessy and Patterson:

WorkPerSec = WorkPerInstruction * InstructionsPerCycle * CyclesPerSecond

Yes, CISC has better work per instruction, except for one glaring issue I'll get to in a moment, but - for various reasons explained throughout H&P - it loses on the other two and thus overall. That's why nobody's making new processors that are CISC internally any more; they just couldn't hit the issue widths and clock speeds that are achievable with a RISC core (even if that core has a CISC ISA bolted on the front). What's missing here is that not all work is useful work. As anyone who has accidentally coded an infinite loop knows, executing lots of instructions is not necessarily a good thing. The glaring issue I mentioned earlier is that a lot of the instructions executed on a register-poor architecture like x86 are not doing useful work. Register thrashing means i-cache bandwidth is wasted fetching instructions which are then used to waste d-cache bandwidth, which more than outweighs any advantage from variable-length instructions.
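
To make the formula concrete, here it is as a few lines of code, with an extra factor for the "not all work is useful" point. All of the numbers in the example are invented to show how the terms trade off; they are not measurements of any real chip.

```python
def work_per_second(work_per_instruction: float,
                    instructions_per_cycle: float,
                    cycles_per_second: float) -> float:
    """WorkPerSec = WorkPerInstruction * InstructionsPerCycle * CyclesPerSecond."""
    return work_per_instruction * instructions_per_cycle * cycles_per_second


def useful_work_per_second(work_per_instruction: float,
                           instructions_per_cycle: float,
                           cycles_per_second: float,
                           useful_fraction: float) -> float:
    """Same product, scaled by the share of instructions doing useful work
    (for example, not register spills and reloads)."""
    return useful_fraction * work_per_second(
        work_per_instruction, instructions_per_cycle, cycles_per_second)


if __name__ == "__main__":
    # Hypothetical designs: denser instructions versus wider issue and a faster clock.
    dense = useful_work_per_second(2.0, 1.0, 2.0e9, useful_fraction=0.7)
    wide = useful_work_per_second(1.0, 3.0, 3.0e9, useful_fraction=0.9)
    print(f"dense-instruction design: {dense:.2e} units/s")
    print(f"wide-issue design:        {wide:.2e} units/s")
```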

So, you say, wouldn't variable-length instructions on a register-rich processor be the best of both worlds? Not so fast. A regular instruction set makes superscalar execution easier because it means that multiple instructions can be fetched literally at the same time without having to examine the first one to figure out where the second one begins and so on. It also makes deeper pipelines easier because it allows many internal activities (e.g. register allocation, hazard detection) to start after a simple pre-decode stage, in parallel with the remainder of decode. Either way, regular instruction sets allow for more parallelism - and parallelism in some form is generally the key to CPU performance. If you're willing to give up performance by eschewing most modern processor-design techniques, which might be the case for a deeply embedded system with extreme size and/or power requirements, then variable-width instructions might still be a reasonable choice. In that case you might as well use an older architecture; there are plenty to choose from. For new processor designs, though, variable-width instructions are almost invariably a way to lose.
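
The instruction-boundary argument can be shown with a toy model. In the sketch below the variable-width "ISA" is invented (the first byte of each instruction is simply its total length); real x86 length decoding is far messier, but the data dependency is the same.

```python
from typing import List


def fixed_width_starts(code: bytes, width: int = 4) -> List[int]:
    """Every start offset is known up front; each could be computed in parallel."""
    return list(range(0, len(code), width))


def variable_width_starts(code: bytes) -> List[int]:
    """Toy encoding: byte 0 of each instruction holds its length in bytes.

    The start of instruction N+1 is unknown until instruction N has been
    examined, so this loop is inherently sequential.
    """
    starts = []
    offset = 0
    while offset < len(code):
        starts.append(offset)
        length = code[offset]
        if length == 0:
            raise ValueError(f"zero-length instruction at offset {offset}")
        offset += length
    return starts


if __name__ == "__main__":
    print(fixed_width_starts(bytes(12)))
    print(variable_width_starts(bytes([1, 3, 0xAA, 0xBB, 2, 0xCC])))
```

Hardware works around the sequential scan with pre-decode bits or by speculatively decoding at several offsets at once, and that extra machinery is the cost being described here.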

--
Slashdot - News for Herds. Stuff that Splatters.

timekeeping can go bad by r00t · 2006-12-15 16:30 · Score: 1

Unless your chip is very recent, the timestamp counter speeds will vary.

Unless your Linux kernel is very recent, this condition will not be detected automatically. Linux will assume that the discrepancy means you are losing clock ticks.

You can try kernel parameters like clocksource=pmtmr to fix it. Good luck, you may need it...

The BIOS vendors disable this power-saving feature because there are Windows games that, like Linux, assume the timestamp counters don't vary in speed.
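
On kernels that expose the clocksource through sysfs, you can check what is currently being used for timekeeping before and after flipping the power-saving bit. A small sketch; it assumes the usual /sys/devices/system/clocksource/clocksource0 directory exists, which older kernels may not have.

```python
from pathlib import Path
from typing import List, Tuple

# Standard sysfs location on kernels with the generic clocksource framework.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(base_dir: Path = CLOCKSOURCE_DIR) -> Tuple[str, List[str]]:
    """Return (current clocksource, list of available clocksources)."""
    current = (base_dir / "current_clocksource").read_text().strip()
    available = (base_dir / "available_clocksource").read_text().split()
    return current, available


def timekeeping_uses_tsc(base_dir: Path = CLOCKSOURCE_DIR) -> bool:
    """True if the kernel is keeping time from the timestamp counter."""
    current, _ = read_clocksource(base_dir)
    return current == "tsc"


if __name__ == "__main__":
    current, available = read_clocksource()
    print(f"current: {current}; available: {', '.join(available)}")
    if timekeeping_uses_tsc():
        print("Timekeeping relies on the TSC; watch for drift with clock divide on.")
```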

Re:timekeeping can go bad by Splork · 2006-12-18 20:01 · Score: 1

good thing to note... i haven't seen any problems on our systems so far but i am keeping my eyes open.

i'll check our kernel sources later to see if they include the code from the referenced lkml post or already default to not preferring the tsc for timekeeping.

Re: Chip sets for AMD are better (No they aren't) by Emetophobe · 2006-12-16 02:03 · Score: 1

With the intel chip set there are only 2 x8 pci-e links coming out of the north bridge, and sas / sata-2, pci-x, networking, as well as the pci-e slots on the board have to share them.

So with a lot of network use and disk use you can choke up that bus.

How did you come to the conclusion that AMD has better chipsets? I can get an nforce/crossfire/via motherboard for either AMD or Intel with pretty much identical specs. Intel has the advantage of making their own chipset, so Intel is the one that has the chipset advantage IMO.

Re:God, I'm sick of this architecture by Anonymous Coward · 2006-12-16 02:14 · Score: 0

REP MOVSD

REP STOSD

Oh yeah, and a whole group of divide instructions; do that shit in a single cycle on an ARM RISC chip, why don't you.

That said the Acorn RISC Machine is certainly the best of the lot.

Re:God, I'm sick of this architecture by aminorex · 2006-12-17 10:40 · Score: 1

It is the microcode architecture which is VLIW. In no wise is the x86_64 ISA a VLIW ISA. But the chips damn well are.

--
-I like my women like I like my tea: green-

98 comments