IBM Releases Cell SDK

Posted by Zonk on Thursday November 10, 2005 @04:27AM from the toys-while-waiting-for-the-next-gen-consoles dept.

derek_farn writes "IBM has released an SDK running under Fedora Core 4 for the Cell Broadband Engine (CBE) Processor. The software includes many GNU tools, but the underlying compiler does not appear to be GNU-based. For those keen to start running programs before they get their hands on actual hardware, a full system simulator is available. The minimum system requirement specification has obviously not been written by the marketing department: 'Processor - x86 or x86-64; anything under 2GHz or so will be slow to the point of being unusable.'"

Source for actual chips? by mustafap · 2005-11-10 04:47 · Score: 3, Interesting

That's great news, but as an embedded systems designer and eternal tinkerer, where will I be able to buy a handful of these processors to experiment with? Without having to dismantle loads of games machines ;o)

--
Open Source Drum Kit, LPLC deve board - mjhdesigns.com
What about a PPC SDK and simulator? by kuwan · 2005-11-10 04:49 · Score: 4, Interesting

As the Cell is basically a PPC processor I find it strange that the SDK is for x86 processors. Fedora Core 4 (PowerPC), also known as ppc-fc4-rpms-1.0.0-1.i386.rpm is listed as one of the files you need to download. Maybe it's just because of the large installed base of x86 machines.

It'd be nice if IBM released a PPC SDK for Fedora, it would have the potential to run much faster than an x86 SDK and simulator.

--
infested with jello like fishes no melotron wishes
GNU toolchain by lisaparratt · 2005-11-10 04:50 · Score: 5, Interesting

The software includes many GNU tools, but the underlying compiler does not appear to be GNU-based.

Is this any surprise? My understanding was that the Cell is a vector processor, and despite the recent upgrades to GCC, it's still fairly awful at autovectorisation.

Can anyone clarify?
Re:Wikipedia article question by AKAImBatman · 2005-11-10 04:55 · Score: 2, Interesting

Cell isn't a System-On-A-Chip. It's just a stripped-down, in-order PowerPC core coupled to 8 single-purpose in-order SIMD units, using an unconventional cache/local memory architecture.

You know, I'm looking back at all these replies to the poor guy, and I can't help but think that he's sitting in front of his computer wondering, "Can't anyone explain it in ENGLISH?!?" :-P

For instance, you have to unroll your "for" loops to start, since those SIMD co-processors can't do loops.

Actually, we need a new programming model. Instead of using FOR loops, we need a model under which you can say, "Perform these instructions X number of times." One could probably do a bit of guess-work in the compiler based on loops like "for(i=0;i<COUNT;i++)", but that doesn't help cases where the loop uses a more complex conditional statement (or where the test is affected by the loop itself). Thus the language needs to be changed to force the programmer to pre-compute the loop length for maximum performance. For example:
int i = 0; do(COUNT) { /*code goes here */ i++; }
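
For what it's worth, something close to the proposed construct can be approximated in ordinary C today. The following is only a sketch: `DO_TIMES` and `sum_first` are hypothetical names invented for this example, not part of the Cell SDK or any compiler. The point it illustrates is that the trip count is evaluated once, before the first iteration, so nothing in the loop body can change how many times the body runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro (an assumption for this sketch): evaluates `count`
 * exactly once, then runs the statement that follows that many times. */
#define DO_TIMES(count) \
    for (size_t do_times_left_ = (count); do_times_left_ > 0; --do_times_left_)

/* Hypothetical helper: sums the first `count` elements of `data`. */
long sum_first(const int *data, size_t count)
{
    assert(data != NULL || count == 0);
    long total = 0;
    size_t i = 0;
    DO_TIMES(count) {
        total += data[i];
        i++;              /* the index is ordinary data, not the loop test */
    }
    return total;
}
```

Because the hidden counter never appears in the body, a compiler is free to unroll or otherwise restructure the loop without having to prove anything about the exit condition.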

--
Javascript + Nintendo DSi = DSiCade
Re:Since the submitter didn't bother to explain... by Anonymous Coward · 2005-11-10 05:22 · Score: 0, Interesting

Where these dumb comments come from:

1) Apple tries to lowball IBM on the mobile 970 design
2) IBM gives Apple the finger - they account for less than five percent of IBM's chip volume
3) Steve goes out on stage and pretends like he has made the 'choice' to move to Intel
4) With Cell processors in Macs no longer an option for Apple, the sour grapes meme that the idiot above parroted starts to make its rounds in Mac circles.
5) Intel's processor roadmap fiasco continues, but what is funny is that Intel's roadmap for chips years down the road has designs that look very close to the STI Cell chips that are being made today.

Enjoy your H.264 encoding times on those wonderful Intel SSE chips, Mac crazies!
Re:Linux on PS3? by MaskedSlacker · 2005-11-10 05:37 · Score: 2, Interesting

Almost definitely. A cheap beowulf of PS3s.
Re:Wikipedia article question by AKAImBatman · 2005-11-10 05:37 · Score: 2, Interesting

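xor eax,eax       ; clear the running sum
mov ecx,b         ; b = element count (assumed to be a nonzero multiple of 4)
shr ecx,2         ; four elements are handled per iteration
sum_loop:         ; "loop" is itself an x86 mnemonic, so the label is renamed
add eax,[ebx]     ; ebx is assumed to point at an array of 32-bit values
add eax,[ebx+4]
add eax,[ebx+8]
add eax,[ebx+12]
add ebx,16        ; advance to the next group of four
dec ecx
jnz sum_loop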

With SIMD instructions, you can execute all four of those adds in one instruction. I wish I knew SSE a bit better, then I could rewrite the above. Sadly, I haven't gotten around to learning the precise syntax. :-(

However, there's a fairly good (if a bit dated) explanation of SIMD here.
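
As a rough illustration of the SIMD point (this is not the poster's code), the sketch below sums an array four 32-bit values at a time. It uses SSE2 intrinsics from `<emmintrin.h>` when the compiler defines `__SSE2__`, and falls back to plain C otherwise. The function name `sum_i32` and the requirement that the length be a multiple of four are assumptions made for the example. Note that the four adds in the assembly all feed one register, so a vector version has to keep four independent partial sums and combine them at the end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: sums n 32-bit integers, n a multiple of 4.
 * Assumes the total (and each partial sum) fits in 32 bits. */
int32_t sum_i32(const int32_t *data, size_t n)
{
    assert(n % 4 == 0);
    int32_t lanes[4] = {0, 0, 0, 0};
#if defined(__SSE2__)
    __m128i acc = _mm_setzero_si128();          /* four partial sums, all zero */
    for (size_t i = 0; i < n; i += 4) {
        /* One unaligned 128-bit load fetches four consecutive values. */
        __m128i v = _mm_loadu_si128((const __m128i *)(data + i));
        acc = _mm_add_epi32(acc, v);            /* four adds, one instruction */
    }
    _mm_storeu_si128((__m128i *)lanes, acc);
#else
    /* Scalar fallback that keeps the same four-lane structure. */
    for (size_t i = 0; i < n; i += 4) {
        for (size_t k = 0; k < 4; k++) {
            lanes[k] += data[i + k];
        }
    }
#endif
    /* Horizontal step: fold the four partial sums into one result. */
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

The final fold is the part the scalar loop never needed; for long arrays its cost is negligible next to the loop itself.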

--
Javascript + Nintendo DSi = DSiCade
Rosetta to the rescue? by Caspian · 2005-11-10 05:45 · Score: 2, Interesting

'Processor - x86 or x86-64; anything under 2GHz or so will be slow to the point of being unusable.'

OK, so what they're saying is "it's slow to emulate a PPC variant on an x86 variant". Duh.

But Apple seems to have cooked up something wonderful (or at least licensed something wonderful) in this vein in the form of Rosetta, the tech that lets Mac OS X for x86 run Mac OS X for PPC binaries very fast.

Sony has several metric fucktons of money. Can't they license the Rosetta technology, or pay for it to be basically "ported" from its current state of PPC-on-x86 to Cell-on-x86? Cell is PPC-based, so it shouldn't be so hard, no?

--
With spending like this, exactly what are "conservatives" conserving?
1. Re:Rosetta to the rescue? by Hal_Porter · 2005-11-10 07:04 · Score: 2, Interesting
  
  Apple wrote a great 68K emulator for the PowerPC Macs. It was not a JIT, and worked like a big jump table. So you took a 16-bit 68K instruction, shifted it and jumped to the base of the table + the shifted offset. The code there would essentially be a PowerPC version of the 68K code.
  
  http://www.mactech.com/articles/mactech/Vol.10/10.09/Emulation/
  
  So you end up doing four instructions to decode the 68K instruction, and then whatever it takes to actually do the operation, typically 2-4.
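
The table-dispatch idea can be sketched with a toy interpreter. Everything here is made up for illustration: the 16-bit encoding (opcode in the high byte, operand in the low byte), the four opcodes, and the names `ToyCpu` and `toy_run` have nothing to do with the 68K or with Apple's emulator, which jumped into blocks of native code rather than calling C functions. What the sketch does share with the description above is the decode step: shift the instruction word, index a table, and transfer control.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy machine state (hypothetical): one accumulator and a program counter. */
typedef struct {
    uint32_t acc;
    size_t pc;
    int halted;
} ToyCpu;

typedef void (*Handler)(ToyCpu *cpu, uint8_t operand);

static void op_halt(ToyCpu *cpu, uint8_t operand) { (void)operand; cpu->halted = 1; }
static void op_load(ToyCpu *cpu, uint8_t operand) { cpu->acc = operand; }
static void op_add(ToyCpu *cpu, uint8_t operand)  { cpu->acc += operand; }
static void op_shl(ToyCpu *cpu, uint8_t operand)  { cpu->acc <<= (operand & 31u); }

/* The "big jump table": one entry per opcode. */
static const Handler dispatch_table[4] = { op_halt, op_load, op_add, op_shl };

/* Runs `len` instruction words and returns the final accumulator value. */
uint32_t toy_run(const uint16_t *code, size_t len)
{
    assert(code != NULL || len == 0);
    ToyCpu cpu = {0, 0, 0};
    while (!cpu.halted && cpu.pc < len) {
        uint16_t insn = code[cpu.pc++];
        /* Decode: shift out the opcode, mask it to the table size, dispatch. */
        dispatch_table[(insn >> 8) & 3u](&cpu, (uint8_t)(insn & 0xFFu));
    }
    return cpu.acc;
}
```

For example, the program `{0x0105, 0x0203, 0x0302, 0x0000}` loads 5, adds 3, shifts left by 2 and halts, leaving 32 in the accumulator. Every instruction pays for the fetch, shift, mask and indirect call, which is the overhead a JIT removes for hot loops.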
  
  JIT emulators would profile the code and check which bits were frequently executed. Then they would essentially copy the table entries into a buffer. So in a loop, you'd just execute the 2-4 native instructions and skip the table dispatch.
  There's another benefit too: you can skip things like condition code updates if you know that they will be overwritten by another instruction before they are checked. Plus you can do peephole optimisations, constant folding and so on.
  
  There's a wonderful article here -
  
  http://www.gtoal.com/sbt/
  
  I can easily believe that CPU intensive code like image processing can run at a very impressive speed, especially as top of the range x86 chips have better SpecInt performance than a top of the range PPC.
  
  Incidentally, I read about Apple's second generation 68K emulator being a "dynamic recompiler", so they've been working on this sort of thing for ages.
  
  --
  echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
Re:Echoes of Redhat by Anonymous Coward · 2005-11-10 06:19 · Score: 1, Interesting

Must be a case of 'brand leakage' from a distant past, one that held Redhat as the most popular desktop Linux distribution.

Uh, or maybe... 1) it's because IBM has a partnership with RedHat, 2) Fedora runs on PPC (which CBE is based on) so I'm sure it's easy for them to modify, 3) there's a good chance this was developed using FC4, so it's just easy to release it for FC4.
Re:Wikipedia article question by hr+raattgift · 2005-11-10 08:59 · Score: 2, Interesting

Ah, OK, I had to think about this a bit... please correct me if I'm still misunderstanding you.

I now think you were using a simile or making an analogy to argue that compilers can benefit from careful construction of loops in the source code.

If so, then of course I agree with you.

Saying this in a much more general way: careful choice of syntax can make the semantics more clear to the compiler.

A high level language with "dotimes (count) { action }" syntax lets the compiler make good choices about loop unrolling and the counter's type.

A language where you have to test and modify your own counter lets the writer make good or incredibly awful choices about loop unrolling and the counter's type.

This version:
int foo(void) { double d = 1.0; int x = 1; while (d > 0) { x = x << 1; d -= 0.1; } return x; }
is semantic brain-damage on a system with very slow, very IEEE doubles, and loop-unrolling this naively is not going to help.

A compiler which realizes that this is a loop whose length is constant can unroll the loop fully, partially, or simply use a better/faster iterator like an integer. But should we end up with 0x400 or 0x800?

Haha, now throw side effects at your smart compiler by inserting a debugging
printf("d: %G, x: %x\n", d, x);
into the while loop ... how should it optimize that?
...
d: 0.2, x: 100
d: 0.1, x: 200
d: 1.38778E-16, x: 400
d: -0.1, x: 800
Right?
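
To make the 0x400-versus-0x800 question concrete, here is a small sketch that puts the floating-point loop next to an integer-counted one. The function names are invented for the example, and the result of the first function assumes IEEE 754 double arithmetic with no excess precision (the usual case on x86-64). Under that assumption, ten subtractions of 0.1 leave a tiny positive remainder, so the loop runs an eleventh time.

```c
#include <assert.h>

/* The loop from the comment: the trip count depends on how 0.1 rounds.
 * With IEEE 754 doubles, d is about 1.4e-16 (not 0) after ten subtractions. */
int shifts_with_double_counter(void)
{
    double d = 1.0;
    int x = 1;
    while (d > 0) {
        x = x << 1;
        d -= 0.1;
    }
    return x;
}

/* The same intent with an explicit integer trip count: no rounding involved. */
int shifts_with_int_counter(int count)
{
    assert(count >= 0 && count < 31);   /* keep the shift within int range */
    int x = 1;
    for (int i = 0; i < count; i++) {
        x = x << 1;
    }
    return x;
}
```

A compiler that "knows" the programmer meant ten iterations and rewrites the first loop into the second would change the answer from 0x800 to 0x400, which is exactly why it cannot make that substitution on its own.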

Anyway, I think we're not really disagreeing. You can write loops stupidly, whether they're iterative (as above) or whether they're recursive. A compiler probably can't save you if you are particularly stupid. It might even make things worse.

For what it's worth, when I say your sentence to myself, I want to make the "like" bold, I guess to emphasize the simile.