Mr+Z · Slashdot Mirror

Re:You're thinking too literally on What's the Worst Acronym You've Ever Heard? · 2002-03-02 14:34 · Score: 1

Must be my cold. Usually I would've gotten a joke like that. :-) Thanks.

--Joe

I'll bite. on What's the Worst Acronym You've Ever Heard? · 2002-03-02 11:24 · Score: 1

I don't get it. What is the significance of K9P?

--Joe

Re:kids today play too many video games... on 40th Anniversary of Video Games · 2002-02-28 07:39 · Score: 2, Funny

Ok, so one time me and my friend spent so much time playing tetris (the cool 2 at a time, race mode thingy on the ol' nintendo) day.

Wow, you must've had the Tengen version (for NES), not Nintendo's version. I envy you.

--Joe

Re:Why delay the hybrid? on It's (Almost) Hammer Time · 2002-02-26 10:56 · Score: 1

Depends on the ABI. If your ABI keeps sizeof(long) == sizeof(int) and sizeof(int) == 4, then you're ok. (This presumes you'll rely on long long for 64-bit integer objects.) What'll break for that model are apps that assume sizeof(long) >= sizeof(void *).

The most likely model will probably make long 64 bits and keep int at 32. Anyone read up on the ABI for x86-64?
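To make the size combinations concrete, here's a rough C sketch. The function names and the enum are made up for illustration; the ILP32 / LLP64 / LP64 labels are just the usual nicknames for these combinations, not something taken from the x86-64 ABI document.

```c
#include <assert.h>
#include <stddef.h>

/* Common nicknames for int/long/pointer size combinations. */
enum model { ILP32, LLP64, LP64, OTHER_MODEL };

/* Classify a combination of sizes (in bytes). */
enum model data_model(size_t int_sz, size_t long_sz, size_t ptr_sz)
{
    assert(int_sz > 0 && long_sz > 0 && ptr_sz > 0);
    if (int_sz == 4 && long_sz == 4 && ptr_sz == 4) return ILP32;
    if (int_sz == 4 && long_sz == 4 && ptr_sz == 8) return LLP64; /* long stays 32 bits */
    if (int_sz == 4 && long_sz == 8 && ptr_sz == 8) return LP64;  /* long grows to 64 bits */
    return OTHER_MODEL;
}

/* Nonzero if code that stuffs a pointer into a long survives this model. */
int long_holds_pointer(size_t long_sz, size_t ptr_sz)
{
    return long_sz >= ptr_sz;
}
```

Calling data_model(sizeof(int), sizeof(long), sizeof(void *)) tells you which bucket your own compiler lands in. The first model above (long == int == 4 with 64-bit pointers) is the one where long_holds_pointer() comes back 0.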

--Joe

Re:Will this hammer... on It's (Almost) Hammer Time · 2002-02-26 10:45 · Score: 1

Not like Intel is that much better. Heck, according to the designers here, Itanium apparently burns 35 Watts in its clock tree alone!

--Joe

Freudian Slip on HTTP's Days Numbered · 2002-02-26 07:42 · Score: 1

Again, with emphasis on the inadvertent Freudian slip:

"Microsoft has some ideas (on how to break the independence on HTTP), IBM has some ideas, and others have ideas. We'll see," he said. But, he added, "if one vendor does it on their own, it will simply not be worth the trouble."

I'm sure they meant "break the dependence on HTTP" (which, for applications that don't fit HTTP's model but are using it anyway, is a good thing). But saying "Microsoft has some ideas on how to break independence" is just beautiful. :-)

--Joe

Re:Do what? on NOA to Sue for Flash Advance Linkers · 2002-02-26 06:26 · Score: 1

Thanks. I was wondering about that myself, and the EFF's FAQ actually cleared things up.

--Joe

Re:Self-documenting code, etc. on Computing Pet Peeves? · 2002-02-26 06:22 · Score: 1

In this case, it was being used as the starting term in a polynomial evaluation for Chien Search. I just looked at the code, and the constant is not quite 1, but rather 0x01010101 (because the code is SIMD-optimized, and is doing the same computation in parallel on four different bytes in one word).
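For anyone who hasn't seen the four-bytes-in-a-word trick, here's a minimal sketch of the idea. This is not the code from that file. The names are invented, and I'm assuming the byte lanes hold GF(2^8) elements (where addition is XOR), which is typical for this kind of decoder but is an assumption here.

```c
#include <assert.h>
#include <stdint.h>

/* "One" in each of the four byte lanes of a 32-bit word. */
#define ONES 0x01010101u

/* Copy one byte into all four lanes by multiplying by 0x01010101. */
uint32_t splat(uint8_t b)
{
    return (uint32_t)b * ONES;
}

/* Read lane 0..3 back out (lane 0 is the least significant byte). */
uint8_t lane(uint32_t w, int i)
{
    assert(i >= 0 && i < 4);
    return (uint8_t)(w >> (8 * i));
}

/* Four GF(2^8) additions at once: addition in that field is XOR,
 * so nothing carries from one lane into the next. */
uint32_t gf_add4(uint32_t a, uint32_t b)
{
    return a ^ b;
}
```

So splat(1) gives you the 0x01010101 constant, and one 32-bit XOR does the work of four byte-wide field additions.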

In this context, I would've named the constant something like "ones", or "one", or maybe even k01010101, rather than a five-syllable name. (Rarely do you need the word 'constant' in a variable name, and 'one' and 'unity' are usually synonyms.) In fact, many of the variables in that file could've been given more terse, yet more descriptive names. I'm firmly convinced it's an art.

Of course, I suppose you could go the other way and have all 2 thru 4 letter variable names, as I did in the IDCT (same page as above). Of course, in that case, I was mostly using the names that Chen used in his paper on the IDCT. (I implemented Chen's decomposition of the 8-pt IDCT.) For other things, though, I have no qualms calling a rounding term rnd or an input pointer i_ptr. Calling it inputPointer is a waste of my 80 columns.

--Joe

Re:it's a cool method on Factoring Breakthrough? · 2002-02-26 05:59 · Score: 2, Informative

Or more correctly, the new algorithm operates in the cube-root of the time of the original. (I'm pretty sure factoring is still an exponential search problem. Would someone who knows this algo better than I comment?)

At any rate, it's not quite as impressive as if an exponential search had been made polynomial. Rather, the exponent in the exponential search's runtime has been divided by 3. (Still a very big deal.)

In terms of big-Oh, it went from O(x^N) to O(x^(N/3)).

--Joe

Re:Self-documenting code, etc. on Computing Pet Peeves? · 2002-02-25 15:01 · Score: 1

Good advice for a huge loop (what the hell is some_ptr->foo[i] 600 lines down and five indents in?), but not necessary in small code, I would think.

Agreed. Indeed, most variable naming schemes fall over at the endpoints, and what you're really left with is a matter of taste at some point. I gave one of the guys here a ribbing because he had essentially this in his code: const unsigned int unityconstant = 1; Uh, yeah.

I personally have no problem with short, meaningful names. Don't make me type ArrayOfStructuredElements[IndexIntoArrayOfStructuredElements]->CounterFieldInStructure += ConstantValueOne when a[i]->cnt++ is sufficient.

--Joe

Re:Errors on Computing Pet Peeves? · 2002-02-25 14:34 · Score: 1

Mac has a similar one (that I encountered often when using MS Word 4 on it back in the System 7 days):

Application Unknown has unexpected quit with Error -1

It's only a half step removed from putting up a dialog box that says "Something bad happened. [OK]"

--Joe

Re:Do what? on NOA to Sue for Flash Advance Linkers · 2002-02-20 15:31 · Score: 1

Bullshit. Otherwise, then how come Robert Crumb lost his rights to "Keep on Truckin'"?

A six-panel page in Zap #1 that caused Crumb a lot of trouble, KOT struck a note in the collective hip unconscious. For a while, KOT was everywhere. The characters and their odd mode of pedal ambulation were made into merchandise, mostly without permission. In the early '70s, Crumb's lawyer threatened suits against anyone who had swiped Crumb's work or ideas. Thousands of dollars rolled in. Then in 1976 a judge ruled that Crumb didn't own KOT --and suddenly he was being pursued by the IRS for the taxes they said he owed on past royalties. Crumb didn't dig himself out of that hole for years.

--Joe

Re:Here's an illegal but fun use for this tech... on NOA to Sue for Flash Advance Linkers · 2002-02-20 15:14 · Score: 1

16MHz ARM vs. 3.5MHz 65816. I think the GBA has a bit of an advantage over the SNES, any day.

--Joe

Re:Itanium vs. Hammer vs. All Others. on What's Next in CPU Land after Itanium? · 2002-02-20 02:21 · Score: 1

athlon is a hardware-level instruction translator--32bit x86 pentium instruction set to riscops

By that argument, the original 8086, which was a microcoded machine, just translated x86 code to its internal microcode and then executed that. Therefore, the original 8086 (for which x86 is named) doesn't run its own code natively. I don't buy it. It's a dead argument.

Really, the primary difference is that Athlon spends all of its transistors focusing on running IA-32 code well, since that is its native programming interface. Itanium has its own native interface that it devotes its transistors to, IA-64, and its support for IA-32 is much, much weaker. Thus, we feel that Athlon's IA-32 is not an emulation, and Itanium's is.

now, if intel can get ia-64 instructions with high levels of parallelism and few dependencies out of x86 instruction stream without making their chip even more gargantuan is doubtful...

They'd essentially have to put a simplified version of their compiler's back end on the die (à la Transmeta and their code-morphing software) to get any kind of decent performance. They're much better off leaving that stuff to the OS or something.

--Joe

Re:The article wasn't clear on TI Lands OMAP in a Pocket PC. · 2002-02-19 07:30 · Score: 1

ARM != StrongARM.

This is very similar to how Athlon != IA-32. Don't confuse an architecture with an implementation of that architecture.

--Joe

Re:Architecture vs. OS-- we REALLY should know bet on Seti@Home Bandwidth Problems · 2002-02-18 16:42 · Score: 0, Offtopic

Ok, so when I get Quake III Arena for Linux, it'll run on your 68K Mac? Oh, what's that? Software targets specific hardware too, and it isn't enough to identify software's target by its intended operating system?

I think what you really meant to say in your rant, is that software targets a platform, and a platform consists not only of the hardware (Gateway G6-300 PC, Apple G4 PowerMac, SunBlade 1000, SunBeam Toaster), but also the OS running on it (Windows / Linux, Mac OS / Linux, Solaris / Linux, George's Custom 30-word RTOS).

You're dealing with slang here. When people say PC without any further qualifiers, they mean "the typical realization of the PC hardware platform running the current mainstream operating system for that hardware." (Which, right now, typically translates to a Wintel box.) We all use shorthand for common phrases. Get over it. At least we're not asking "Does this computer have the Internet on it?"

--Joe

Re:Itanium vs. Hammer vs. All Others. on What's Next in CPU Land after Itanium? · 2002-02-18 15:56 · Score: 1

there is absolutely no reason to assume that an 800MHz Itanium emulation of a Pentium IV can't beat the performance of a 800MHz Pentium IV with a well-written emulator.

I agree. But we're not talking about writing an emulator (which would seem to imply software-level recompilation of IA-32 code into IA-64 to reach the potential you describe). We're talking about the hardware-level instruction translation that Itanium has. Hardware-level instruction translators have much less opportunity to take advantage of the many resources of the native CPU.

Given that, the assumption that "work-per-clock" for IA-32 on IA-64 is comparable or worse than IA-32 on IA-32 seems more than reasonable.

--Joe

Re:Itanium vs. Hammer vs. All Others. on What's Next in CPU Land after Itanium? · 2002-02-18 15:51 · Score: 1

The only real "solution" I could see to get around the slow x86 emulation would be to put a separate, higher clocked x86 chip outside the Itanium (oldschool math-coprocessor style) and sync the two processors somehow (now that would be a chipset designer's hell!)

Shouldn't be too bad if both CPUs had the same bus protocol. If done right, it could look like SMP, even though it's asymmetric.

A different approach might be to give the IA-32 CPU a simpler bus that's not as aggressive on performance. That'd simplify system design, while still allowing x86 apps to run reasonably. Since the OS and graphics code are likely on the IA-64, the resulting system should still have decent performance, and could even have reasonably large performance benefits over a single IA-32 system. (That is to say, IA-32 code running on an IA-32+IA-64 combo machine should run about as well as, or better than, IA-32 code running on an IA-32-only machine, even though in the combo machine, the IA-32 subsystem isn't nearly as optimized for performance as it is in the IA-32-only machine. The advantage comes in having the OS and graphics drivers running on the IA-64.)

At any rate, asymmetric multiprocessing isn't new. I program DSPs for a living. DSPs often end up on a board with many other DSPs and a completely different "host CPU" that drives them all. (That's what cellular base-stations look like under the hood.) If that isn't asymmetric multiprocessing, I don't know what is. :-)

--Joe

Re:Fine and good, but Itanium is hard to JIT for on What's Next in CPU Land after Itanium? · 2002-02-18 15:34 · Score: 1

I don't buy it. You should be able to do a pretty decent job, even on the fly. Otherwise Transmeta wouldn't have even half a chance. As it is, they're doing pretty good -- maybe not bleeding edge performance, but very acceptable for running essentially "JIT x86". Transmeta and Itanium both are VLIW machines, and so it's reasonable to expect the hurdles involved aren't gigantically different.

Granted, to get ultimate performance out of Itanium, you need to jump through some fiery hoops. (I program a VLIW machine at my day job, so I'm familiar with VLIW's quirks.) But to get "pretty good" performance, it's not too bad. And that's what JIT needs. There's lots of low-hanging fruit.

Another thing you fail to take into account is that dynamic compilers have many opportunities to improve on the generated code that static compilers lack. (That was the primary insight of the Dynamo project I linked before.) For Itanium, this is especially true -- a dynamic compiler could get feedback information on which loads miss the cache, which branches are mispredicted, etc. and twiddle the code appropriately. A static compiler would just have to guess, or accept profile information from previous runs.

--Joe

Re:Itanium vs. Hammer vs. All Others. on What's Next in CPU Land after Itanium? · 2002-02-18 15:23 · Score: 1

In short, I agree that MHz != performance. Before you dismiss my original post out-of-hand, though, my argument wasn't complete bunk.

Itanium's claim to fame is its notable lack of hardware rescheduling and related toys. It's an in-order machine in just about every way, and it relies on static scheduling at compile time to achieve good performance.

In contrast, x86 machines nowadays are highly out-of-order machines, and they rely heavily on register renaming and other hardware tricks to get decent performance. They spend an awful lot of transistors on just figuring out which instruction to execute next.

In all likelihood, these tricks are NOT implemented in the IA-32 -> IA-64 translation engine on the Itanium die. It just doesn't make sense. So, it's extremely likely that real IA-32 work-per-clock numbers for IA-32 code running on IA-64 are strictly less than or equal (with the emphasis on less-than) to the work-per-clock numbers for IA-32 code running on IA-32, especially for CPUs at the same clock rate.

If you agree with that assumption, then consider the best-case scenario where work-per-clock is EQUAL on both platforms. An 800MHz Itanium can therefore be no better than an 800MHz Pentium III.

Now consider the clock rate advantage that newer CPUs have. You'd have to be off by a factor of 3 on work-per-clock to not have a significant advantage with a 2200MHz part. That's all there is to it.

1.5GHz Athlons get ahead of 2.2GHz Pentium IVs because they get about 1.5x as much work done per clock. That's pretty significant (and says more about how the compilers *aren't* there for Pentium IV than anything else), but it's still a VERY far cry from 3x.

Basically, my point is that I understand that MHz != performance and that work per clock is very important. I was pointing out, though, that Itanium has TWO very steep hills to climb to run IA-32 code successfully: It's harder to get decent work-per-clock for IA-32 workloads running on IA-64, and IA-64 has a significantly lower clock rate than leading-edge IA-32. So, in the performance equation "work = work_per_clock * clock_rate", IA-64 comes up a stinker on both terms of the equation.
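Here's the back-of-the-envelope arithmetic as a tiny C sketch, using only the clock rates quoted in this thread. The function names are mine, and "work per clock" is left as an abstract number rather than anything measured.

```c
#include <assert.h>

/* work = work_per_clock * clock_rate, as in the equation above. */
double work(double work_per_clock, double clock_mhz)
{
    assert(work_per_clock >= 0.0 && clock_mhz >= 0.0);
    return work_per_clock * clock_mhz;
}

/* How much more work per clock the slower-clocked part must do
 * just to break even with the faster-clocked one. */
double breakeven_ratio(double slow_mhz, double fast_mhz)
{
    assert(slow_mhz > 0.0);
    return fast_mhz / slow_mhz;
}
```

breakeven_ratio(800, 2200) comes out to 2.75, which is the "factor of 3" hill. For comparison, breakeven_ratio(1500, 2200) is a bit under 1.5, in line with the Athlon figure.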

--Joe

Re:Itanium vs. Hammer vs. All Others. on What's Next in CPU Land after Itanium? · 2002-02-18 15:02 · Score: 1

Who would ever do a cycle-for-cycle perfect emulation of anything, when the point is just to mimic the -functionality-? It'd be stupid. Computer architects are the only ones who need that accuracy -- everyone else settles for just functional correctness.

Basically, what I was trying to say is "assume that you somehow map the x86 instructions onto the Itanium instruction set in hardware so that the total cycle count (ignoring memory system effects) for the translated instructions is no greater than the original code under any circumstance." In that scenario, you'd still be 1/3rd the speed of the flagship CPU. Now, things don't look *quite* so bad once you start throwing memory system effects in there, but for things that fit in the L1s, you're hosed.

It's unrealistic to think that the hardware translator will close that 3x gap.

--Joe

Re:Itanium vs. Hammer vs. All Others. on What's Next in CPU Land after Itanium? · 2002-02-18 10:11 · Score: 2, Insightful

When Apple transitioned from the M68K line to the PPC, they were in the same situation - 68K code would run faster on a 40MHz 68040 than on a 40MHz PPC 601. The reason consumers didn't mind was that the PPC 601 started at 60MHz (approximately the break-even point to the emulation layer), and (to the end user) didn't cost significantly more.

While that's a valid point, it also bears pointing out that Pentium IV is at 2200 MHz whereas Itanium is at 800MHz -- about 1/3rd the clock speed. That ratio is going to remain for a while too -- McKinley will come out at 1000 MHz, while Pentium IV continues its mad march toward 3000MHz and beyond. You acknowledge this fact implicitly with your next statement (re: Itanium not viable until approx same speed at approx same cost), but I felt it'd be interesting to point out just how large a gap there is.

These ratios spell doom for hardware-level emulation of the Pentium on the Itanium. Unless Intel has some serious magic, having a 100% cycle-for-cycle perfect emulation of the Pentium III or even Pentium IV on the Itanium die will never run better than 1/3rd the speed of the real thing, since the fundamental clock rate is so far off. The only real way to get close is to do a software-level translation and get a boost from scheduling for the native hardware.

It's interesting to note, BTW, that HP's Dynamo project does a software translation of PA-8000 code targeting (guess what) a PA-8000 CPU, and rather than slowing things down, it actually gets 20% speedups! Ars Technica also did a piece on this. Perhaps that's why HP doesn't have hardware-level translation from PA-RISC to Itanium on the die like Intel does -- they (HP) are in a better position to just translate the PA-RISC code to IA-64 when needed. (Also, in the UNIX world, it's just simply less necessary.)

--Joe

Re:Once had to same problem, on Determining Color Difference Using the CIELAB Model? · 2002-02-17 15:07 · Score: 1

Yeah, your "formula" doesn't work too well for colors around #808080, does it?

--Joe

[OT] comment editing on How Many CDs Can You Burn at Once? · 2002-02-15 17:18 · Score: 1

Even better -- why not disallow comment editing if the post has already been either (a) moderated, or (b) replied to? And disallow commenting or moderating a comment while it's marked "edit in progress"? (Bound the latter state by a fairly short timeout -- say, 1 or 2 minutes -- and if the timeout is exceeded, convert the edit to a followup.)

There are race conditions, but those too can be solved. A reply that was started relative to the original (while a parallel edit gets made) would get attached to the original, not the edited version. (The window in which this could happen should be fairly small, since once you click on "Edit Post", it should block replies.) Once the edited version is committed, the original and any followups that arrived in the short editing window are buried behind a "[See original (%d followups)]" link.

Moderations applied to a comment while it's being edited remain attached to the original. Again, the window here is small -- if the post is marked as "edit in progress" when the mod points are being applied, the moderation can instead just be dropped and ignored. The timing windows only arise due to concurrency issues when you have multiple servers and stuff.
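The rules above fit in a small state machine. This is only a sketch of the single-server case: the struct, the function names, and the 120-second timeout are all made up for illustration, and it doesn't model the multi-server races or the "[See original]" link.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EDIT_TIMEOUT_SEC 120  /* "1 or 2 minutes"; 120 is an arbitrary pick */

struct comment {
    int  mods;          /* moderations applied so far */
    int  replies;       /* replies attached so far */
    bool editing;       /* "edit in progress" flag */
    long edit_started;  /* when the edit began, in seconds */
};

enum commit_result { EDIT_APPLIED, EDIT_BECOMES_FOLLOWUP };

/* The edit lock only counts while the timeout hasn't run out. */
static bool edit_active(const struct comment *c, long now)
{
    return c->editing && now - c->edit_started <= EDIT_TIMEOUT_SEC;
}

/* Editing is allowed only if nobody has moderated or replied yet. */
bool begin_edit(struct comment *c, long now)
{
    assert(c != NULL);
    if (c->editing || c->mods > 0 || c->replies > 0)
        return false;
    c->editing = true;
    c->edit_started = now;
    return true;
}

/* Replies are refused while an edit is in progress. */
bool add_reply(struct comment *c, long now)
{
    assert(c != NULL);
    if (edit_active(c, now))
        return false;
    c->replies++;
    return true;
}

/* Moderations that arrive during an edit are simply dropped. */
bool add_moderation(struct comment *c, long now)
{
    assert(c != NULL);
    if (edit_active(c, now))
        return false;
    c->mods++;
    return true;
}

/* An edit committed in time replaces the text; a late one is
 * posted as a follow-up to the original instead. */
enum commit_result commit_edit(struct comment *c, long now)
{
    assert(c != NULL && c->editing);
    bool in_time = edit_active(c, now);
    c->editing = false;
    if (in_time)
        return EDIT_APPLIED;
    c->replies++;
    return EDIT_BECOMES_FOLLOWUP;
}
```

The timeout check lives in one helper on purpose: once it lapses, replies and moderations go through again, and the stale edit can only land as a follow-up.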

--Joe

Re:x86-64! on Linus Merges ALSA Into 2.5.4 · 2002-02-13 16:47 · Score: 1

I think the idea of a whole separate arch directory for x86-64 is that it really is a different architecture that will have different assembly code and rather different OS-level issues when it comes to context switches and so on. I'd imagine it's conceptually similar to the ext2 vs. ext3 directory split.

--Joe

User: Mr+Z

Comments · 3,254