IBM-Sony-Toshiba Reveal New Cell Processor Details
BBCWatcher writes "The three main partners in the Cell Processor initiative announced technical details of the new architecture. IBM's documents are particularly revealing. There's much more information on how developers, including open source developers, can access the SPUs (Synergistic Processor Units). As reported earlier, Sony will put the Cell into every Playstation 3 game machine, due early next year. And yes, Cell runs Linux."
So when can I buy a 'pc' based on these things...
Or even a development board..
---- Booth was a patriot ----
And yes, Cell runs Linux.
I just assumed Linux ran on everything, including iPods
Thanks! I still got a little soul suckage from that site, though. Check out the EULA - talk about broad (luckily I can do these things once I'm no longer using the site):
Prohibited Conduct
Following acts are not allowed when using this Web Site:
(1) Infringing the legal rights (including, but not limited to, the rights of privacy and publicity) of SCEI and/or others
(2) Causing any damages or disadvantage to SCEI and/or others
(3) Disturbing public order
(4) Criminal act
(5) Defaming, disgracing or libeling SCEI and/or others
(6) Uploading files that contain viruses or corrupted files that may damage the operation of SCEI's and/or others' computers
(7) Activities that are unlawful or prohibited by any applicable laws
(8) Any other activities that SCEI deems inappropriate
HIV Crosses Species Barrier... into Muppets
What's to strip? Linux is just the kernel - as I'm sure you've read here before. I run a 'stripped down kernel' - I don't build parts of the kernel I don't need. You probably do the same, but maybe without realising.
Earlier in the design, the SPUs were called Streaming Processing Units (you know, like SSE, Streaming SIMD Extensions). However, they didn't want to give the impression that the SPUs were designed only for "streaming data" kinds of tasks, so they decided to change the name.
;)
:)
I guess "SPU" had already stuck with the developer team, so they just switched the word to "some meaningless word with S" so they could keep the acronym. And as far as meaningless words with S go, "Synergistic" fits the bill quite nicely.
After the fact, of course, they can let the marketroids make up explanations on how the name is actually about the "synergy" between the main processor and the SPUs, blah blah blah...
The filesystem is the package manager
How do you program those SPUs, besides hand-coded assembly? For media / game apps, it's probably acceptable to hand-code vector instructions for the performance-critical parts, but for everything else you're going to use - at best - the 2 generic execution contexts, and the SPUs will sit idle.
The Raven
Not to be a buzzkill, but it looks like we'll have to wait for a lot of development and middleware maturity before we see the real potential in Cell processors.
Yes, but why worry about something so trivial when we've got anti-gravity technology?
http://www.blachford.info/quantum/gravity.html
And faster than light travel?
http://www.blachford.info/quantum/fastlight.html
Blachford is just as qualified to talk about processor technology as he is about physics. He's an attention seeking charlatan lacking either the experience or qualifications to contribute anything but hype and bullshit. And he's becoming just as ubiquitous and irritating as that Piquapelle prick.
Modern DMA engines frequently allow you to store DMA descriptors in a section of memory, usually in the form of a list. You then provide the starting address of the list to the DMA engine, maybe twiddle some bits, and off the DMA engine goes and processes the list element by element. The commands in the list can get really fancy depending on the DMA engine. You should read the documentation the article talks about and find out about it; it seems to be a good example of a fancy DMA engine.
Oh, and that list is sometimes called a chain.
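To make the descriptor-list idea concrete, here is a minimal toy model in Python. It is only a sketch of the general technique described above, not the Cell's actual list format: the `Descriptor` fields, the `run_dma` function and the flat `bytearray` standing in for memory are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Descriptor:
    """One hypothetical DMA command: copy `length` bytes from src to dst."""
    src: int
    dst: int
    length: int
    next: Optional[int]  # index of the next descriptor, or None to end the chain


def run_dma(memory: bytearray, descriptors: List[Descriptor], start: int) -> int:
    """Walk the chain from `start`, performing each copy in turn.

    Returns the number of descriptors processed.
    """
    processed = 0
    index: Optional[int] = start
    while index is not None:
        d = descriptors[index]
        # The slice on the right is copied before assignment, so overlapping
        # source and destination ranges behave like a buffered transfer.
        memory[d.dst:d.dst + d.length] = memory[d.src:d.src + d.length]
        index = d.next
        processed += 1
    return processed


memory = bytearray(range(32))
chain = [
    Descriptor(src=0, dst=16, length=4, next=1),
    Descriptor(src=4, dst=24, length=2, next=None),
]
count = run_dma(memory, chain, start=0)
```

The CPU only builds the list and hands over the index of its first element; everything after that happens in the loop, which plays the part of the engine.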
Je me souviens.
Probably isn't quite as in-depth, though.
How the fuck did a post which explicitly states it has less information than the main story get modded Informative?
- w00t?
...without the G5 AltiVec enhancements or any instruction reordering, or any of the things that make a G5 cool.
Why would anyone engrave "Elbereth"?
I just downloaded all of the Cell PDFs to take a look at them. I posted the following analysis to news:comp.arch:
...
Naturally, I started reading the SPU asm manual, and that makes it
immediately obvious that this is a cpu directly targeted at MPEG style
video processing:
absdb Absolute difference of bytes
avgb Average bytes: dest = (a+b+1) >> 1 (MPEG interpolation)
cg Carry Generate: Target = carry out of (A+B)
addx Add word extended: Target = A+B+(Target & 1)
Notice the last one! It uses the least significant bit of each part of
the target register as input to an AddWithCarry operation, which means
that you need three read ports.
This pair of opcodes seems to me to be meant as building blocks for
extended/arbitrary precision calculations.
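A rough scalar model of the four opcodes listed above, in Python, shows how the Carry Generate / Add word extended pair could chain into a wider addition. This is a sketch based only on the one-line descriptions in the list: the function names are mine, each function models a single element rather than a whole SIMD register, and the step of moving the carry from one word slot to its neighbour is skipped.

```python
MASK32 = 0xFFFFFFFF


def abs_diff_byte(a: int, b: int) -> int:
    """Absolute difference of two unsigned bytes."""
    return abs(a - b)


def avg_byte(a: int, b: int) -> int:
    """Rounded average of two unsigned bytes: (a + b + 1) >> 1."""
    return (a + b + 1) >> 1


def carry_generate(a: int, b: int) -> int:
    """Carry out (0 or 1) of the 32-bit addition a + b."""
    return ((a & MASK32) + (b & MASK32)) >> 32


def add_extended(a: int, b: int, target: int) -> int:
    """a + b + (target & 1), truncated to 32 bits.

    `target` is the old value of the destination, which is why the
    real opcode needs a third read port.
    """
    return (a + b + (target & 1)) & MASK32


def add64(a: int, b: int) -> int:
    """64-bit add built from two 32-bit halves."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32
    lo = (a_lo + b_lo) & MASK32
    carry = carry_generate(a_lo, b_lo)     # carry out of the low words
    hi = add_extended(a_hi, b_hi, carry)   # carry arrives via the target
    return (hi << 32) | lo
```

Going beyond two words would also need the carry out of a + b + carry-in, which this model does not attempt.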
It has a full set of branch instructions that as a side-effect either
enable or disable interrupts, i.e. critical sections are supposed to be
handled this way.
It seems to handle sub-register size operations with a set of opcodes,
where one of a group of GenerateMask operations is used to generate an
input mask for a general shuffle operation.
There's a bunch of generalized three-input FMAC opcodes, all working on
SIMD data, like fnms (T = Acc - (a * b)).
It has frsqest and frest to generate approximate reciprocal square root
and reciprocal lookup values. However, these operations do not seem to
deliver results in a standard format, instead each resulting element
consists of two parts, a base and a step, so that a following fi
(Floating Interpolate) can improve upon the table lookup results.
I'm guessing you'd then want one NR iteration to get somewhere close to
IEEE single precision.
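The guess about one NR (Newton-Raphson) iteration can be checked numerically with a small Python sketch. The estimate function below is only a stand-in: it truncates the true reciprocal to 12 significant bits, which is an assumed accuracy and has nothing to do with the real base-and-step table format. All names are made up.

```python
import math


def crude_reciprocal(a: float, bits: int = 12) -> float:
    """Stand-in for a table estimate: 1/a truncated to `bits` mantissa bits.

    Assumes a > 0. The 12-bit default is an assumption, not a figure
    taken from the SPU documentation.
    """
    mantissa, exponent = math.frexp(1.0 / a)   # mantissa in [0.5, 1)
    scale = 1 << bits
    return math.ldexp(math.floor(mantissa * scale) / scale, exponent)


def newton_step(a: float, x: float) -> float:
    """One Newton-Raphson refinement of x ~= 1/a: x' = x * (2 - a*x)."""
    return x * (2.0 - a * x)


def rel_error(a: float, x: float) -> float:
    """Relative error of x as an approximation of 1/a."""
    return abs(1.0 - a * x)


a = 3.0
x0 = crude_reciprocal(a)
x1 = newton_step(a, x0)
print(rel_error(a, x0), rel_error(a, x1))
```

Each step squares the relative error, so an estimate good to roughly 12 bits comes out good to roughly 24 bits, which is the size of a single-precision mantissa.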
The shufb (Shuffle bytes) opcode seems like a small extension to the
AltiVec Permute, in that in addition to using 5 bits to select one of 32
possible input bytes, it can also specify three different immediate
values (0, 0x80 and 0xFF), which would be needed to make it work with
the GenerateMask operations mentioned above.
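A byte-level model of such a shuffle, in Python, might look like the sketch below. The selection of one of 32 input bytes and the three constants come from the description above; the exact control-byte bit patterns that pick each constant are my assumption and should be checked against the ISA document before relying on them.

```python
def shuffle_bytes(ra: bytes, rb: bytes, control: bytes) -> bytes:
    """Model of a 16-byte shuffle over the 32-byte concatenation ra + rb.

    Assumed control encoding (not verified against the manual):
      10xxxxxx -> 0x00,  110xxxxx -> 0xFF,  111xxxxx -> 0x80,
      anything else -> source byte selected by the low 5 bits.
    """
    if not (len(ra) == len(rb) == len(control) == 16):
        raise ValueError("all three operands must be 16 bytes")
    source = ra + rb
    out = bytearray(16)
    for i, c in enumerate(control):
        if c & 0xC0 == 0x80:
            out[i] = 0x00
        elif c & 0xE0 == 0xC0:
            out[i] = 0xFF
        elif c & 0xE0 == 0xE0:
            out[i] = 0x80
        else:
            out[i] = source[c & 0x1F]
    return bytes(out)


ra = bytes(range(16))        # bytes 0..15
rb = bytes(range(16, 32))    # bytes 16..31
reverse = bytes(range(15, -1, -1))
reversed_ra = shuffle_bytes(ra, rb, reverse)
```

A control vector that copies most bytes straight through and redirects a few is what makes inserting a sub-register value into a full register possible in one instruction.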
All in all a pretty general set of opcodes for SIMD data processing; it
is particularly obvious in the way each of the possible operations has
forms to work on either a set of input data (reg or immediate), or on
its complement. This saves a lot of bubble-introducing mask setup
operations, but is normally not considered to be required on a regular cpu.
Terje
"almost all programming can be viewed as an exercise in caching"