Cell Architecture Explained
IdiotOnMyLeft writes "OSNews features an article written by Nicholas Blachford about the new processor developed by IBM and Sony for their Playstation 3 console. The article goes deep inside the Cell architecture and describes why it is a revolutionary step forwards in technology and until now, the most serious threat to x86. '5 dual core Opterons directly connected via HyperTransport should be able to achieve a similar level of performance in stream processing - as a single Cell. The PlayStation 3 is expected to have 4 Cells.'"
..it probably is.
was the ps2 the supercomputer it was said to be...?
the author goes on to suggest that cell workstations would smoke x86 counterparts.. but says at the same time that there probably won't be that many of them.
wtf? though in-between the lines you can read at the end that he also thinks a single g5-cpu workstation would 'smoke' x86's...
world was created 5 seconds before this post as it is.
Something that has always confused me about gaming consoles is that, despite incredibly powerful hardware (processors, graphics chips, etc.), the system developers always seem to neglect to put in enough RAM for most games to perform to their potential. PC ports often have portions compromised due to the lack of RAM, and system speeds also suffer because of this.
Seeing how RAM keeps getting cheaper, is it possible that new systems like the PlayStation 3 might be able to provide enough RAM to let games actually reach their potential along with this new Cell hardware?
remember before it was released, how powerful they said it was, so powerful and revolutionary it may not be exported to some countries, as it could be used to guide missiles and stuff...etc...
well, usual PR stuff, sony is very good at it, what amazes me is that it works again and again....people never learn, they say...
I'll believe it when I see it. Sony made outrageous claims with the PS2 in the year or so before launch; I see no reason to believe this will be any different.
On paper an Emotion Engine was supposed to destroy everything, but achieving maximum throughput was difficult and other constraints such as I/O and memory hampered performance. Programmers had to learn a very different way of programming to make full use of the processor and its two vector units.
A Cell might be a killer chip on paper, but real-world hardware with I/O latency and memory constraints will bring things down to a more reasonable level. Don't forget that multiprocessor programming is *hard*.
Hopefully, developing software for Cell chips will be easier than in the early days of the PS2. Sony already said as much a few months ago.
Quotes from article:
"GPUs will provide the only viable competition to the Cell but even then for a number of reasons I don't think they will be able to catch the Cell."
Did this guy forget that NVidia is designing the GPU for PS3? If Cell is so almighty, why does Sony use an NVidia GPU instead of using more Cells for graphics processing?
"There is another reason I don't think Nvidia or ATI will be able to match the Cell's performance anytime soon."
Of course, Cell based products won't be available anytime soon either. According to the current rumors, PS3 will be available in Japan in Spring 2006 and elsewhere in Autumn 2006. One and a half years equals a generation in the GPU world...
I love these kinds of articles where some future products are compared against current ones and declared clear winners...
"No Apps"? Try every single video game publisher in the world.
.. Sony or MSFT... I'd say it's absolutely no contest. Sony would crush MSFT. They have better interface design, fewer conflicting platform goals, and they'll put a PS3 in your living room for a fraction of what MSFT could.
And besides, this isn't about "Office" style apps. It's about games, and more importantly: it's about home media centers. I think the Windows MCE is going to have its rear-end handed to it by the PS3.
When you consider that a cell-based PS3 could have a computational power of *several times* a 3 GHz Pentium...
You have to ask what's more likely: that Intel can get around IBM/Toshiba patents in time for Windows to conquer the living room with a faster box (that's if they can even build a secure, stable OS with a decent UI)? Or that Sony, now armed with the world's fastest consumer-computing platform, an enormous user base and years of TiVo experience, will own the living room media center market?
If I had to bet on who builds a better media-center PC
This sounds like a little PVM-cluster-on-a-chip. It also sounds like it's a pain to program and will, in the short term, suffer from the same problems that Intel's Itanium suffers from: it tries to push too much work on the compiler or software developer.
In the long term, it's nice that companies are exploring these kinds of architectures. It's not nice that they are trying to monopolize what are pretty straightforward architectural choices with patents. This may be a new CPU, but there is little that is new about having a bunch of fast processors interconnected via a reconfigurable network; these just happen to be on the same chip.
Indeed, sounds to me like Sony's marketing behemoth is getting into top gear promoting Cell in any way possible. Although this might not be directly connected to Sony. Wild claims and theoretical performance papers have been wrong in the past when it came to yet another product with mind-blowing specs (Crusoe, anyone?).
No matter how well a processor or group of processors can run tasks concurrently, it will always come down to the fact that most tasks are serial in nature and will not scale to a concurrent processing architecture. Aside from this, developing multi-threaded software is extremely difficult and is rife with problems. Just ask any developer about the hardest problem to find/debug. It is pain incarnate and some MT bugs can take 5+ days to find. People design serially, because a lot of tasks are essentially serial in nature, and until this design paradigm gets a major shift and we design parallel-only software [LOL], Cell has no future.
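For what it's worth, the scaling limit being argued here is usually written down as Amdahl's law. A rough sketch in Python; the parallel fractions and the worker count of 8 are made-up numbers for illustration, not measurements of Cell or any real workload:

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of a task can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # The serial part runs at normal speed no matter how many workers exist.
    serial_fraction = 1.0 - parallel_fraction
    # Only the parallel part is divided among the workers.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical workloads: half, 90% and 99% parallelisable.
    for fraction in (0.5, 0.9, 0.99):
        speedup = amdahl_speedup(fraction, 8)
        print(f"{fraction:.0%} parallel on 8 workers: {speedup:.2f}x")
```

Under these assumptions, even a task that is 90% parallel gets well under 5x from eight workers, so how "serial in nature" the typical task really is decides most of the argument.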
Who cares? Mac OS X and Linux will provide all the applications required. Windows apps will likely be available under emulation. The Windows market will still dominate but there will be a gradual migration when people realise there are cheaper/better realistic alternatives available at last.
It's not crap; we produced release versions of our graphics software for Windows on x86, PowerPC, MIPS and Alpha at one point. Shipped some, too. We had machines for all four architectures (still have them, in fact, though the Alpha and PowerPC's are mothballed), development tools, and working Windows OS's on all of them, and they all ran Windows NT, approximately the same version. Perfect, definitely not -- but Windows under x86 isn't perfect either. It worked well, certainly no worse than the x86 versions. We still use one of the MIPS machines as a backup file server. It refuses to die.
Now, I'm no fan of Windows, but if you think MS couldn't port Windows to another architecture beyond x86, you're only fooling yourself. They can any time they want to, they have already, three times that I know of for certain, not counting whatever credit you want to give Windows CE ports, if any, and there you have it. For all I know there may have been ports to 68k architectures... I wouldn't be in the least bit surprised.
You have to consider that MS has more money than anyone, and if they decide to go this route, there is no reason to think they cannot do it. I doubt there is any market force, including Sony and the largest governments in the world, that could put a serious roadblock in front of them in this arena.
I've fallen off your lawn, and I can't get up.
I'm sorry, but Sony can kiss my ass.
This is from the company that said the Playstation 2 would have Toy Story quality graphics, and be able to render FF8 quality FMVs in real time (thus making FMVs no longer required). It was essentially that bullshit hype that killed the Dreamcast... so yeah, now they're at it again.
Maybe I'll be proven wrong, but I doubt their system will be able to do anywhere near what they say it can in practical application.
I'm willing to believe that a 4.6 GHz chip with 8 ALUs and high bandwidth memory would be fast, but even in bulk, there's no way they can afford to put 4 of them in a sub-$500 game console.
I've been reading PR about the Cell for years, and nothing I've ever read has seemed even remotely plausible. Is there any objective information that even comes close to substantiating any of these claims?
What most slashdotters fail to realise is that a vector processor would blow chunks if used for general purpose processing and hence you won't be seeing a desktop based on them anytime soon.
Sure they may have vector processing in the form of SSE etc but vector processing is of no use for word processing, period!
It only performs like 20 Opterons in highly parallelisable tasks. Which excludes almost every task performed on the average PC, with the exception of some gaming graphics tasks (which, incidentally, are performed on specialised GPU's which vastly outperform x86 cores for their tasks anyway). Most of the time, a single Cell core will perform pretty much identically to the single Power chip that controls it.
While I tend to agree the Cell is an impressive architecture, this article is a steaming pile of B.S.
No cache for CPUs? A breakthrough? Hello! Both PSone and PS2 have the so-called scratchpad, which is what the Cell seems to have: a cache which has to be managed explicitly by the programmer. Breaking news: This is a royal pain in the ass. And calculating bandwidth when reading from these tiny scratchpads makes about as much sense as calculating the speed at which an x86 processor can execute MOV EAX, EBX.
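To make "managed explicitly by the programmer" concrete, here is a toy sketch in Python. It is not PS2 or Cell code: the `SCRATCHPAD_WORDS` capacity and the `scale_with_scratchpad` function are invented for illustration, and a real scratchpad would be filled by DMA rather than list slicing. The point is only that every copy-in and copy-out is written out by hand instead of being handled by a hardware cache:

```python
from typing import List

SCRATCHPAD_WORDS = 4  # hypothetical capacity, tiny on purpose


def scale_with_scratchpad(main_memory: List[float], factor: float) -> int:
    """Scale every element in place, staging data through a fixed-size buffer.

    Returns the number of explicit transfers (copy-ins plus copy-outs).
    """
    transfers = 0
    for start in range(0, len(main_memory), SCRATCHPAD_WORDS):
        # Copy-in: the programmer decides which chunk lives in the scratchpad.
        scratchpad = main_memory[start:start + SCRATCHPAD_WORDS]
        transfers += 1

        # Compute: only data already in the scratchpad is touched.
        for i in range(len(scratchpad)):
            scratchpad[i] *= factor

        # Copy-out: results must be written back by hand as well.
        main_memory[start:start + len(scratchpad)] = scratchpad
        transfers += 1
    return transfers


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0]
    count = scale_with_scratchpad(data, 2.0)
    print(data, count)
```

Even in this trivial case the chunking, the partial last chunk and the write-back are all the programmer's problem, which is roughly the complaint being made above.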
Magically "the OS solves everything", and, in an obvious attempt to automatically get OSS-crowd support (is that "slashdot-trolling" or "slashdot-baiting"?) the triumph of Linux is predicted, because it's portable. Good luck getting the Linux kernel and GCC compiled, let alone running well on a massively parallel array of tiny CPUs without cache.
You have not read it. It will be on a specific class of tasks. It is similar to modern GPUs. They are faster than 10 Opterons on a specific task.
Back to the article. The guy seems to understand hardware, but he does not understand shit about software. Once he got past the first 3 parts he started babbling. Linux on Cell, so on, so forth. If he had just read his previous parts he should have hit himself on the head. The only type of Linux this can run is uClinux. There is no memory protection as such. So no Linux, no Windows past 2000, no MacOS past X, so on, so forth.
Similarly, it is all nice and well about Cell software beasties making herds by themselves and cooperating on a task. I am going to be a spoilsport and ask a nasty question: Err.. What about a security model? Memory protection? Privilege model for communications? So on, so forth...
To continue on this, the power of a modern general purpose OS is the task switching. How long does it take to load and store the context of the vector processing units? Doing so requires moving their dedicated memory to main memory. This will take ages.
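A back-of-envelope way to put a number on "ages", sketched in Python. Every figure in the example call is a placeholder I made up (local memory size, bandwidth); only the count of 8 vector units per Cell comes from this discussion. It also ignores register state, transfer setup overhead and any overlap between saving and loading:

```python
def context_switch_seconds(units: int,
                           local_store_bytes: int,
                           bandwidth_bytes_per_s: float) -> float:
    """Estimate the time to swap the dedicated memory of all vector units.

    Assumes each unit's local memory is written out to main memory and the
    next task's contents are read back in, one after the other.
    """
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    # Factor of 2: one save of the old context plus one load of the new one.
    bytes_moved = 2 * units * local_store_bytes
    return bytes_moved / bandwidth_bytes_per_s


if __name__ == "__main__":
    # Placeholder numbers, NOT specifications: 8 units, 128 KiB each, 10 GB/s.
    seconds = context_switch_seconds(8, 128 * 1024, 10e9)
    print(f"estimated full swap: {seconds * 1e6:.1f} microseconds")
```

Whether the result counts as "ages" depends entirely on the real local memory size, the real bandwidth and how often the OS actually needs to swap vector tasks, none of which are settled here.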
Overall, this is a design similar to the initial Cray 1 design. Cray's initial design smashed the IBM, DEC (and lesser fish) monopoly on big computing iron to bits. Unfortunately the next thing the people buying the Cray asked for was "can we share this resource between two people?". The answer was provided eventually, but by the time Cray could do all the nifty time sharing and memory management tricks necessary to do this its advantage was no longer phenomenal. And all the people who could use Crays for single tasks with manual scheduling actually continued to use them that way. But it did not even dent the general purpose big iron market.
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/
There is memory protection. Read the whole thing. What I think bit you was the fact that he said there was no virtual memory... well, even then his wording is confusing, as virtual memory is just swapping out pages of memory as you need more. This can be done on the Cell. What I think he is talking about is address translation. Paging hardware must not implement a full LogicalAddr==>LinearAddr==>PhysicalAddr paging/segmentation unit (I have not read the patent myself). He mentions that during runtime the address must be physical/real and that, when running on an APU, they may be given access restrictions. I must digress, though, and tell you that I am no expert either.
The OS is in for quite a bit of work when dispatching apulets, as I can see adjusting addressing and other things being as interesting as (or more interesting than) different scheduling mechanisms are in current systems today. To get a secure system out of this will require protected memory, and if I remember correctly the Cell may be capable of running multiple OSs in parallel VMs. This can be explained by considering that IBM has its own software layer that one's OS would talk to (at least the article made it seem that way). It's almost like having a microkernel (or an exokernel in some ways) that then has real things attached to it. Like Linux, for example. Linux can already be run in user mode and even on top of the L4 microkernel. Linux has been shown to be portable enough (along with most good modern software). I would not have any doubt about seeing this happen with IBM.
ruby -le"32.times{|y|print' '*(31-y),(0..y).map{|x|~y&x>0?'
85 Celsius operation with heat sink
Well, perhaps "cool!" is not the correct response...
It says with a heat sink only. Not with a fan!
The last chips that worked without a fan were the 486DX33 and 486DX40 (I'm talking mainstream desktop PC hardware, not mobile solutions). You could probably stick a fan on it and get it down to 40 degrees, while a Pentium 560 will produce liquid plasma and/or a fusion reaction if operated without a fan.
P.
Maybe you (and others) haven't noticed, but the desktop PC is a deer in the headlights. Game machines will take over before you can say 'service contract'.
Pft. People have been saying this every time a new console generation is coming. When the upcoming Playstation 2 was hyped, some people were claiming it would easily emulate a PC at many times the speed of an x86. When it came, people couldn't take full advantage of the hardware. When they could some years later, PC hardware had surpassed it. Besides, people value the flexibility of a PC. In other words, bs then, bs now.
Being bitter is drinking poison and hoping someone else will die
RTFA. He covers that in explicit detail.
This part I agree with. His statements regarding abstraction are just flat out incorrect. Is this going to be programmed in assembly only? I think not...and if not there is significant abstraction involved. The thing that's closest to his point is that multiple *layers* of abstraction tend to add significant overhead. That doesn't mean that program-level abstractions do.
Once he got past the first 3 parts he started babbling. Linux on Cell, so on, so forth. If he had just read his previous parts he should have hit himself on the head. The only type of Linux this can run is uClinux. There is no memory protection as such. So no Linux, no Windows past 2000, no MacOS past X, so on, so forth.
There is memory protection if the PU is in fact "something like a G5". IBM would have to be insane not to include an MMU, and it has already stated that it's going to build workstations based on the Cell architecture.
All in all, interesting stuff...we'll see how it plays out. :-)
To continue on this, the power of a modern general purpose OS is the task switching. How long does it take to load and store the context of the vector processing units? Doing so requires moving their dedicated memory to main memory. This will take ages.
This, of course, depends on how many cells are in the box (with 8 vector units per cell) and how many tasks need vector units. The main purpose of the vector units in an interactive workstation will be multimedia processing. How many multimedia applications can you view at once? For me, the answer is one. The vector units may be useful for other things like engineering simulation and pattern matching, but once again how many different tasks using those features will be running at once? Plus if the processors are cheap enough to put 4 in a Playstation, one hopes the workstations will have 8 to 32 of them.
Overall, this is a design similar to the initial Cray 1 design. Cray's initial design smashed the IBM, DEC (and lesser fish) monopoly on big computing iron to bits. Unfortunately the next thing the people buying the Cray asked for was "can we share this resource between two people?". The answer was provided eventually, but by the time Cray could do all the nifty time sharing and memory management tricks necessary to do this its advantage was no longer phenomenal. And all the people who could use Crays for single tasks with manual scheduling actually continued to use them that way. But it did not even dent the general purpose big iron market.
Two points. First, this is based on an already successful processor - the Power series. It already multitasks :-) and is used in a wide range of applications. Second, this will be a low-cost part. Crays were a super high-end system, which cost millions of dollars. Your analogy doesn't work.
Galileo: "The Earth revolves around the Sun!"
Score: -1 100% Flamebait
Secondly: anyone that buys a PC to play games on has more money than sense and is quickly parted from the latter.
TWW
"Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"
Yes, because the guy has a goofy website he must be ignored!! He likes linux, too! He must be a complete crackpot and because he is making a joke about his nose all of his ideas must be bullshit!
God, I love being an anal asshole. And my shit don't smell either!!!
This was not a technology article. That was an "I for one, welcome our new cell processor overlords.." article.
I don't see anything in the Cell architecture that would fundamentally make the same number of transistors at the same speed operate faster. I see lots of bottlenecks, I/O overhead and wasted transistors. If there is some magical powerful thing that these can do SO much better than the current x86 instruction set and hardware, guess what, it'll adapt.
x86 adapted to RISC being "wildly faster" and, in the end, became better RISC than RISC was by translating more memory-efficient x86 instructions onto a RISC backend. It adapted to SIMD (Single Instruction, Multiple Data) efficiency issues by adding MMX/MMX2/SSE/SSE2 and 3DNow. It adapted to the reality of 64 bit address space and the need for more registers with the new X64 instruction set extensions. AMD and Intel could add Cell hardware and instructions too if they offered anything special, which I highly doubt they will.
set softtabstop=4 shiftwidth=4 expandtab nocp worlddomination
Wonder if they have also been working to optimize Linux for the CELL processor? I for one will be watching this very closely...
The NSA: The only part of the US government that actually listens.
In other words, media data and processing algorithms will be behind an impenetrable DRM hardware wall. "Cell programs" (the little vectorizable data manipulators) will be trade secrets. Outsiders that want to program something new will only be able to string together DRM approved cells. For example, there might be an approved MPG6 cell that will report meta-data found initially in a MPG6 stream but Rights Management interests will never permit any cell that exports all of the MPG6 data.
Why does the recommended single chip PE (processing element) include 8 DPUs? My guess is that a certified library of Cell Programs will not allow anything to be sent off chip that is not strongly encrypted. Thus one might have an 8 DPU chip where 3 are used to decrypt the input, 2 to do the actual processing, and 3 are used to encrypt the output. This off-chip disadvantage is a strong reason for putting multiple PUs and their 8 DPUs on one chip. If intercommunication between Cells cannot be detected externally then there is no need for the encryption/decryption stuff.