be-fan · Slashdot Mirror

Re:Good point. Unfortunately ... on IBM to use Cell in Blade Servers · 2006-02-09 16:13 · Score: 1

Actually, a 10th of the FLOPS, which makes Cell very unimpressive compared to a dual-core Opteron for real scientific computations.

Re:Apple had its own reasons... on Apple Switched Chips Too Soon? · 2006-02-09 15:26 · Score: 1

Cache and memory latencies are significantly aggravated by a long pipeline, which is why the G5 still gets fewer IPC than the G4 despite the G5's truly impressive design.

No they aren't. Yonah has very low latency to L1 and L2 cache, despite having a moderately long pipeline. Pipeline length does affect IPC, but as a result of cycles wasted during mispredicted branches. Yonah's branch predictor is a couple of generations more advanced than the G4's, which means it can easily cover the latency introduced by its longer pipeline.

Yonah's pipeline is probably at least twice as long as the G4's: the P6 pipeline is 10+ stages and the latest version added at least 4 more.

If the branch predictor is good enough, it really doesn't matter.
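
To put rough numbers on that, here is a minimal sketch of the usual back-of-the-envelope model: every mispredicted branch costs a pipeline flush, so the stall cycles per instruction are (branch fraction) x (mispredict rate) x (flush penalty). All the figures below are made up for illustration; they are not measurements of the G4, Yonah, or any other real chip.

```python
def effective_ipc(base_ipc, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """Return IPC after charging each mispredicted branch a pipeline flush."""
    base_cpi = 1.0 / base_ipc
    # Extra cycles per instruction lost to refilling the pipeline.
    stall_cpi = branch_fraction * mispredict_rate * flush_penalty_cycles
    return 1.0 / (base_cpi + stall_cpi)


# Hypothetical cores: same base IPC, same code (20% branches).
# Short pipeline with a weaker predictor vs. a pipeline twice as long
# with a much better predictor.
short_pipe = effective_ipc(1.5, 0.20, 0.08, 7)
long_pipe = effective_ipc(1.5, 0.20, 0.03, 14)

print(f"short pipeline: {short_pipe:.2f} IPC")
print(f"long pipeline:  {long_pipe:.2f} IPC")
```

With these assumed rates the longer pipeline comes out slightly ahead, because the penalty doubles but the mispredict rate drops by more than half. Flip the predictor quality around and the short pipeline wins, which is the whole point.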

Oh, yeh, Yonah isn't all that new a design. It's another spin on the venerable P6 core, with SSE3 and a better shared cache this time around.

The distance between the P6 and Yonah is a lot bigger than the distance between the e600 core and the 74xx core. P6 versus Yonah is more like PPC 604e versus G4 than G4 versus MPC8641D.

The benchmarks I've seen comparing the G4 (even with the e600 core) with the Core Duo are all based on the 7448 and earlier, and all of those are crippled by the slow memory bus. These results are consistent with Yonah's main performance advantage being memory speed.

Yonah's main performance advantages are the micro-ops fusion (which makes loads/stores relatively cheap because in many cases they don't chew up issue bandwidth), the awesome branch predictor, and the dual load/store pipelines. All these things are more important for integer code than the raw memory bandwidth of the processor.

When you take a system with a single core and a 166 MHz memory bus (like the Powerbook) and compare it to one with dual core and a 533 MHz memory bus of course it's going to get its ass kicked... but the benchmark it really got toasted on was stream...

It got toasted on SPECint too (a factor of 2x at the same clock-speed = toasted), and SPECint isn't that memory-bandwidth sensitive. If it were, POWER5, with 36MB of cache and 16GB/sec of memory bandwidth, would totally toast the Opteron with 1MB of cache and 6.4GB/sec of memory bandwidth. However, even per clock the POWER5 isn't much faster than the Opteron.

Take an MPC8641D and put memory on BOTH of the 768 MHz memory busses and if it doesn't beat Yonah I'd be very much surprised.

Why would you possibly be surprised? It's like putting a PIII on a 667MHz bus and expecting it to outperform a Pentium-M. The cores themselves are a couple of generations apart.

PS> The memory bus is 667MHz for the MPC8641D, not 768MHz.

Re:Apple had its own reasons... on Apple Switched Chips Too Soon? · 2006-02-09 10:27 · Score: 1

The G4 core is no slouch.

It is compared to Yonah.

With a short pipeline it does very well and doesn't require a sophisticated out-of-order processor to get good performance.

OOO processing isn't so much a remedy for a long pipeline as it is a remedy for cache and memory latencies. These things aren't any lower on the G4 than they are on Yonah.

This is similar to the problems of the P6 core (which Yonah is based on) before Intel grafted a faster FSB to it.

Yonah is a complete relayout of the P6 core, with new FSB, new branch predictor, reworked cache interfaces, and more relaxed decoding rules, as well as new techniques like out-of-line stack management and micro-ops fusion. It is a direct descendant of the P6 core, no doubt, but it's an extremely refined version with much more per-clock performance. The 8641D, in contrast, is a much more straightforward transition of the G4 to a dual-core design.

I would like to see the source of your SPEC comments.

Apple :)

So far as I know there haven't been comparable benchmarks of the MPC8641D and Yonah on similar hardware. You can't generalise from the performance of the 7448 because it's still crippled by the slow bus of the previous G4s.

I don't have results, but I can make some educated guesses based on the architectures of the two processors. Yonah is, in many respects, state-of-the-art. The G4 isn't state of the art in any respect save AltiVec. Will an integrated memory controller make the G4 faster? Undoubtedly. Will it make it twice as fast at the same clockspeed? Unlikely. Will it make it comparable to Yonah at the same clockspeed? Again, unlikely. Will Motorola be able to scale its short-pipe design to the same clockspeeds as Yonah/Merom? Again, unlikely. You can only get so far with a design that hasn't been fundamentally changed since the 1990s!

Re:Eye candy can make sense on Novell Makes Public Release of Xgl Code · 2006-02-08 15:08 · Score: 1

Neither API supports curves at the lowest levels. Both handle curves by tessellating them to polygons, then sending the generated vertices down the transform and rasterization stages. Hardware tessellation isn't used that often on OpenGL, however, since OpenGL-capable platforms (unlike the embedded platforms targeted by OpenVG) are able to do flexible and high-quality tessellation in software.
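
For anyone wondering what "tessellating in software" looks like, here is a minimal sketch that flattens a cubic Bezier curve into a polyline by sampling it at evenly spaced parameter values. This is an illustration only, not how any particular OpenGL or OpenVG implementation does it: the function name and the fixed segment count are my own assumptions, and real tessellators usually subdivide adaptively based on curvature.

```python
def flatten_cubic_bezier(p0, p1, p2, p3, segments=16):
    """Approximate a cubic Bezier curve with a polyline of `segments` pieces.

    Each control point is an (x, y) tuple. Returns segments + 1 vertices.
    """
    points = []
    for i in range(segments + 1):
        t = i / segments
        u = 1.0 - t
        # Bernstein basis weights for a cubic curve.
        b0, b1, b2, b3 = u**3, 3 * u * u * t, 3 * u * t * t, t**3
        x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
        y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
        points.append((x, y))
    return points


# An arch from (0, 0) to (1, 0); the vertices would then go down the
# normal transform and rasterization path as line or polygon edges.
polyline = flatten_cubic_bezier((0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0))
print(len(polyline), "vertices")
```

More segments gives a smoother curve at the cost of more vertices, which is the trade-off a software tessellator gets to tune per curve.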

Re:Apple had its own reasons... on Apple Switched Chips Too Soon? · 2006-02-08 13:56 · Score: 1

No they didn't. The new dual-core G4 is fairly primitive technology by today's standards. The G4's out-of-order execution abilities are limited, it's got very shallow issue queues, it can only do one load/store per cycle, etc. Yonah is, per core, twice as fast in SPEC as the current G4 (which is consistent with the performance of both chips relative to the Pentium III). Sure, the "enhancements" in the MPC8641D will boost performance somewhat, but do you honestly think Freescale can double the IPC without fundamentally changing the architecture? IPC aside, it's projected to run "above 1.5 GHz", and "up to 2.0 GHz on a next-generation process", while Yonah is already at 2.16 GHz and will scale to 2.33 GHz before the MPC8641D is even released. Merom is going to blow both of them away, with another 10-20% more performance at the same clockspeed, and clockspeeds scaling way past 2.33 GHz at 65nm.

That is not to say the MPC8641D doesn't have its uses --- it only dissipates 25W at 1.5 GHz on a 90nm process, and should dissipate about half as much as Merom when both are at 65nm. For its target embedded market, this is a substantial advantage. However, on a laptop, that difference is less marked. Below 30W, the CPU isn't the main user of power in the laptop anymore (the LCD, graphics, and storage devices are) and decreasing the CPU's power draw further won't net significant gains in battery life. Apple needs both high performance and low power, and Freescale can't provide that.

Re:Apple had its own reasons... on Apple Switched Chips Too Soon? · 2006-02-08 08:34 · Score: 1

The Mac market is quite diverse, including everything from home users (the bulk of the market), to educational users, scientific users, and media professionals. Vector processing can be useful to part of the Apple market (media professionals), but AltiVec itself is not really crucial except to a very specific market. APIs like CoreImage/CoreVideo allow programs to leverage the immense vector processing capabilities of the GPU, and can replace AltiVec in many scenarios, such as image processing and even audio processing. Where AltiVec is really essential is in situations where one needs both the flexibility of a general purpose processor and the vector capability of a DSP. Such scenarios exist (doing complex analysis of FFT'ed data in real time would be an example), but it is not often the case that the vector and logical computations of a program cannot be broken up to fit the CoreImage/CoreVideo model.

Re:Apple had its own reasons... on Apple Switched Chips Too Soon? · 2006-02-07 14:04 · Score: 3, Insightful

The game console cores suck. They are 2-issue in-order designs with crappy branch prediction. Initial reports suggest that they are barely fast enough on integer code to keep the FPU fed, and that's with low-level gaming code. God help you if you're trying to run generic, unoptimized C code on it.

It's 2006 --- no programmer of desktop/workstation/server programs is going to spend time optimizing their code to make up for a flawed processor design. It's 2006, and a few things have happened that apparently no-one told the "Cell on the desktop" folks about:

1) Programs are becoming platform-agnostic. Especially at the workstation/server level, many important applications run on multiple platforms. This often means they are not highly optimized on any platform. This was always one of the things that held the G5 back --- its high theoretical performance was often nullified by its reliance on tight, well-scheduled code tuned to its idiosyncrasies. Super-optimized apps are a luxury few users have. Hell, as an engineer, much of the code I write runs in Matlab's JIT. You think that does G5 optimizations? A processor that does not run all these minimally-optimized apps well is not going to fly on the desktop/workstation.

2) The world is moving towards higher-level languages and higher-level programming constructs. If your CPU can't run machine code with whatever optimizations the JIT can spit out in 100 milliseconds, it sucks. As someone who does a fair bit of programming, I love the Opteron for one reason: it doesn't care how much my code sucks (from a performance standpoint). It lets me write clear, clean code, and runs it with decent performance. I don't have to drop into SHARK to figure out why my 5-issue processor is behaving like a 2-issue one because of instruction scheduling issues, I don't have to sacrifice virgin blood on the altar of code alignment, and I don't have to bust out Altivec to get good FPU performance. Programmers in the desktop/workstation/server markets have gotten used to processors that serve the software, not force the software to serve the hardware. A 2-issue in-order core is not going to fly with them.

3) Vector performance has largely become irrelevant except in a few markets. Yonah has shitty vector performance, and nobody in x86 land really cares. Most desktop CPUs these days spend their time running integer logic code, or double-precision floating-point, letting the heavy vector lifting be handled by the GPU. As APIs like CoreImage/CoreVideo take off, things like VMX and AltiVec will become still more irrelevant, except perhaps to those people running FFTs all day long.

Re:Apple had its own reasons... on Apple Switched Chips Too Soon? · 2006-02-07 12:32 · Score: 4, Interesting

Apple's claim that Intel won on watts has been thoroughly discredited in the press and in the blogosphere.

It has been discredited everywhere except in reality. IBM had no good competitor to Yonah and Conroe. The G5 was a long-pipeline, high-frequency design, and it just plain ran too hot for a laptop. Yonah is offering integer performance competitive with the top 970MP, with a power budget 1/3 the size and a CPU die about half the size. POWER6 is just another step in the wrong direction as far as Apple is concerned. It's got a higher frequency, longer pipeline, lower IPC, and an even worse INT/FP performance balance than the G5 had. It's the Pentium 4 all over again. Perhaps POWER6 will be the Pentium 4 done right, but no matter what, it's not going to be a good chip for Apple's machines. Especially when you consider what will happen when you take a long-pipeline (inherently bandwidth hungry) design like POWER6, which is optimized for 32GB/sec of memory bandwidth and tens of megabytes of cache, and stuff it into a PC system with 8GB/sec of memory bandwidth and a power envelope of 60W.

Re:48 pixel pipelines on ATI All-In-Wonder X1900 PCIe Review · 2006-02-04 10:54 · Score: 4, Informative

They didn't jump from 16 to 48 pixel pipelines. The x1000 cards have a fairly non-traditional architecture. Instead of having a fixed set of pixel pipelines with fixed resources, they have a large shader array, running a number of rendering threads. ALUs are assigned to each thread as necessary. The X1900 increases the number of shader units from 16 to 48, but both the X1800 and X1900 have 16 texture units and 16 raster-op units. So both cards can do 16 texture lookups per clock, and commit 16 pixels to memory per clock. Where the extra ALUs in the X1900 come in handy is for complex shaders, where the X1900 can do far more calculations per pixel than the X1800.
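
A rough way to see this is a simple bottleneck model: pixel throughput is capped by whichever of the ALUs, texture units, or raster-op units runs out first. The sketch below uses the unit counts above, but the model itself is my own simplification. It assumes one operation per unit per clock and perfect overlap between stages, and the shader mixes are hypothetical.

```python
def pixels_per_clock(alus, tex_units, rops, alu_ops, tex_lookups):
    """Peak pixels per clock for a shader needing `alu_ops` ALU operations
    (must be > 0) and `tex_lookups` texture fetches per pixel."""
    limits = [rops, alus / alu_ops]
    if tex_lookups > 0:
        limits.append(tex_units / tex_lookups)
    # The scarcest resource sets the overall rate.
    return min(limits)


X1800 = dict(alus=16, tex_units=16, rops=16)
X1900 = dict(alus=48, tex_units=16, rops=16)

# Simple shader: 1 ALU op and 1 texture lookup per pixel.
x1800_simple = pixels_per_clock(**X1800, alu_ops=1, tex_lookups=1)
x1900_simple = pixels_per_clock(**X1900, alu_ops=1, tex_lookups=1)

# Complex shader: 6 ALU ops and 1 texture lookup per pixel.
x1800_complex = pixels_per_clock(**X1800, alu_ops=6, tex_lookups=1)
x1900_complex = pixels_per_clock(**X1900, alu_ops=6, tex_lookups=1)

print("simple: ", x1800_simple, x1900_simple)
print("complex:", x1800_complex, x1900_complex)
```

Under these assumptions the two cards tie on the simple shader, since both are capped by the 16 texture and raster-op units, and the X1900 only pulls ahead once the shader becomes ALU-bound.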

Re:Lisp not accessible? on Beyond Java · 2006-02-01 12:15 · Score: 1

The terminal emulator version lacks GUI menus, and the GTK2 version doesn't have anti-aliased text for the actual editing area. The GTK2 version is additionally not compliant with the GNOME HIG. Emacs.app, on the other hand, blends quite nicely into the OS X GUI.

Re:Lisp not accessible? on Beyond Java · 2006-02-01 11:40 · Score: 1

I'm unfamiliar with SciTE, but if it has parens matching and syntax highlighting for Common Lisp, it should be okay. Regarding Allegro: there is a trial edition that's downloadable for free. http://www.franz.com/downloads/trial.lhtml. It has a heap limit of 23MB, though, which means it's unusable for some things.

Re:Lisp not accessible? on Beyond Java · 2006-02-01 11:37 · Score: 1

Tell me about it. It wasn't even modern C++ (which I find at least tolerable). It was C++ in the style of the CORBA C++ binding, which is one of the most hideous APIs ever inflicted on man.

Re:Lisp not accessible? on Beyond Java · 2006-02-01 10:46 · Score: 2, Informative

It's been done. It's called Dylan. It's got much of the power of Lisp, much of the speed of C++, and has a conventional syntax. Apple was pushing it for a while in the early 1990s. Unfortunately, it never really caught on. The Lisp users didn't like the new syntax, and the project's distancing of itself from Lisp, while the mainstream didn't like it because it was created by a bunch of Lisp gurus. The Apple Dylan implementation suffered from being too memory hungry (like most OOP languages) to fit properly in a Newton, and lost out to a competing C implementation of Newton OS. Ultimately, the language was killed (within the company) when Apple closed its Cambridge Development Lab. Today, there are two existing Dylan implementations maintained by the Gwydion Project. The first, Gwydion Dylan, is derived from CMU's Dylan Compiler while the second, Open Dylan, is derived from Harlequin/Functional Objects's compiler.

Re:Lisp not accessible? on Beyond Java · 2006-02-01 10:35 · Score: 1

What editor are you using? Lisp really requires a good editor to get the most out of it. If you're on Windows, there are a couple of good Lisp environments, namely Allegro's personal edition. Linux is probably one of the worst platforms to do Lisp development in, since the native Emacs is sub-par in terms of features like anti-aliased fonts, etc. Personally, my favorite Lisp development platform is OS X, since OpenMCL, Emacs.app, and SLIME make one heck of a free Lisp IDE. It's got the creature comforts you'd expect from a modern IDE (syntax highlighting, which helps the parens blend into the background, automatic indentation and parens-matching, which is a must-have for Lisp editing, Safari-integrated API documentation, etc). It's also got some features that are unusual in other editors, like expression-based editing and interactive code evaluation.

Re:Lisp not accessible? on Beyond Java · 2006-02-01 09:45 · Score: 2, Informative

I have to concur. After several years of programming C++, switching to Lisp was an awesome experience. It's a very productive language, has very fast compilers that can generate native code, and has some excellent development tools (Emacs + SLIME). For people who can get away with using it on a project (ie: you're not forced to make yourself easily replaceable by using a "commodity" language), I seriously recommend at least an enthusiastic attempt to learn it. "Enthusiastic" is the key word here. If you go in skeptical, you'll never get past the fact that it's different from what you're used to. You'll get hung up on the parens, the naming conventions, etc. Common Lisp was designed to be a programmer's language first and foremost. Thus, you really can't appreciate its power and flexibility until you write something substantial in it.

Re:A softer, kinder Linux... on Google Working on Desktop Linux · 2006-01-31 13:04 · Score: 1

normal people never, ever, ever ever want to deal with the command line

That's a bit overstated. When I do tech support for my dad, he prefers it when he can get away with me telling him simple command line statements instead of the horrendous "okay, um, find the thing that looks like a hat, and click, no, I said a hat, what you don't see a hat?"

Re:3 Word Summary of Practical Mono on Practical Mono · 2006-01-30 17:10 · Score: 4, Funny

Arguing C# versus Java is like arguing about the color of the turd you stepped in.

Re:Performance good enough for games? on Practical Mono · 2006-01-30 17:01 · Score: 1

People keep bringing up game programming, but it's a poor example. Games are usually not written to be maintainable over the long term, and as such tend to use very ugly code to wring the last bit of performance out of the hardware (though this is getting more rare now as the GPU becomes more of a bottleneck than the CPU). If you're doing processor-specific scheduling, asm inner loops, etc, C# won't offer you the same level of performance. However, C# will get you most of the way to the same level as clean, portable C code built with some care towards performance. For most tasks, even many performance intensive ones, that's enough.

Re:That's such a calous opinion... on Court Rules Burning Porn = Making Porn · 2006-01-29 08:32 · Score: 1

I really think you missed my point completely. What I was getting at is not that these oppressive laws about sex are right, but rather that they do serve their purpose. However, the fact that they are useful does not make them right. Laws need to be not only useful, but also just.

Re:Why? on Court Rules Burning Porn = Making Porn · 2006-01-28 20:21 · Score: 2, Interesting

Even the 10,000 figure reported in the BBC article would put the AIDS incidence rate at the level of a developed country like the UK, instead of at the level of developing countries in its economic class. Like it or not, harsh attitudes and laws about sex do a good job of preventing diseases like AIDS from spreading, especially in poor countries where access to contraceptives and early diagnosis is limited. However, just because the laws (and punishments!) are useful doesn't mean the laws are just, which was the point I was getting at.

Re:The question here on Court Rules Burning Porn = Making Porn · 2006-01-28 13:51 · Score: 1

This case has implications for any other case that might use it as a precedent. The precedent here is not "burning a CD of child porn equates to distribution of child porn", but rather, "burning a CD equates to distribution". When some lawyer digs up this case to go after a file sharer who burned an archival CD, he's not going to make the distinction between music and child porn, because logically there isn't one. If burning one type of thing to CD constitutes distribution, burning another type of thing also constitutes distribution. Precedents aren't limited to the specific case in question --- they can have implications across the whole of law.

Re:Why? on Court Rules Burning Porn = Making Porn · 2006-01-28 13:41 · Score: 3, Interesting

Theoretically, the justice system in this country is based on justice, not on prevention. That's why we have a principle that says punishments should fit the crime, and don't just prescribe the death penalty for everything.

Consider adultery or fornication. In most places, it's a minor offense, punishable by a fine. We could be like Afghanistan, and punish it by stoning. That is the most effective way of deterring people from committing the crime, and indeed has desirable social ramifications (repressive Muslim countries have a *far* lower AIDS incidence rate than their economic class would suggest, and indeed places like Afghanistan and Iran have lower rates than developed nations like the United States). We don't do this, however, because such a punishment would not be just.

Re:AMD64 on Intel and HP Commit $10 billion to Boost Itanium · 2006-01-28 08:17 · Score: 1

Actually, it may be easier to develop more efficient compilers (for the Itanium) for the higher level languages than it would be for 'C'. Might be, I don't develop compilers.

With Itanium, you're stuck writing complex code generators just to get the relative performance of the Itanium up to what simple code generators can achieve on other processors. The effort spent on the code generator represents a lot of development effort that could be focused on optimizing higher level constructs instead.

Re:AMD64 on Intel and HP Commit $10 billion to Boost Itanium · 2006-01-27 08:11 · Score: 1

The primary issue with Perl, Python, and Ruby is that they use interpreters, instead of native-code compilers. Smalltalk and Lisp usually use native compilers, which speeds things up quite a bit (between 50% and 100% the speed of C code). However, there is still research for these languages, namely in allowing the compiler to better optimize code written in a high-level style.

Re:AMD64 on Intel and HP Commit $10 billion to Boost Itanium · 2006-01-27 06:25 · Score: 1

What's better than C? Lots of things: ML, Dylan, Python, Lisp, Scheme, Haskell, Ruby, Objective-C, Smalltalk, the list goes on. After programming in C++ for many years, and now getting to work on Lisp code, it's obviously clear to me which one I prefer to code in. But my point isn't to tell people they should be using Lisp. My point is that if you're into advanced compiler research, it's kind of pointless to spend your effort getting C code to run as fast on a VLIW as it does on a RISC. It doesn't advance the state of the art any, it just allows you to use a crappier processor to get the same sort of performance you do now. Working on high-level languages, on the other hand, advances the state of the art. It makes it possible to use high-level languages where previously low-level ones would've been used for performance reasons. More generally, it goes back to the point that processors are cheaper than programmers. Developing advanced compiler technology to make it possible to get away with a cheaper processor, instead of developing advanced compiler technology to make it possible to get away with using less programmer time, well, it seems backwards to me.

Comments · 8,382