Cell Architecture Explained
IdiotOnMyLeft writes "OSNews features an article written by Nicholas Blachford about the new processor developed by IBM and Sony for their Playstation 3 console. The article goes deep inside the Cell architecture and describes why it is a revolutionary step forwards in technology and until now, the most serious threat to x86. '5 dual core Opterons directly connected via HyperTransport should be able to achieve a similar level of performance in stream processing - as a single Cell. The PlayStation 3 is expected to have 4 Cells.'"
It's not like we haven't heard it before. It usually turns out to be halfish-truish for some restricted subset of operations in a theoretical setting, you know, where you discount buses, memory and latencies.
Yeah, but can my inkjet print them?
Be relentless!
I don't know about you but I'm looking forward to putting Linux on one of these.
a DBZ reference: "Part 4: Cell Vs the PC"
How long until the first beowulf cluster of ps3s?
Also can it run linux?
..it probably is.
was the ps2 the supercomputer it was said to be...?
The author goes on to suggest that Cell workstations would smoke x86 counterparts, but says at the same time that there probably won't be that many of them.
wtf? Though reading between the lines at the end, he also seems to think a single G5-CPU workstation would 'smoke' x86s...
world was created 5 seconds before this post as it is.
Judging by your user name, it seems you are using outdated technology. x86 can no longer get the first post!
Something that has always confused me in gaming consoles is that, despite incredibly powerful hardware (processors, graphical chips, etc.), the system developers seemingly always neglect to put in enough RAM for most games to perform to their potential. Many PC ports often have portions compromised due to the lack of RAM, and system speeds also suffer because of this.
Seeing how RAM is increasingly becoming cheaper, is it possible that new systems like the PlayStation3 might be able to provide RAM that actually allows games to reach their potential along with this new cell hardware?
Haha you got robbed at 3am in the morning...!
WTF are you talking about?
Is the integration of multiple processors into a single 'unit' the answer to the alleged woes of pushing a single processor as fast as it can go? I seem to be seeing more and more consumer-oriented solutions that involve multiple processors. Is this simply due to reduction in costs, or is it destined to become the norm? Some technologies (e.g. Intel's hyper-threading) appear to be a prelude to the latter.
Remember before it was released, how powerful they said it was? So powerful and revolutionary that it might not be exported to some countries, as it could be used to guide missiles and stuff... etc.
Well, the usual PR stuff. Sony is very good at it; what amazes me is that it works again and again... people never learn, they say...
I'll believe it when I see it. Sony made outrageous claims with the PS2 in the year or so before launch, I see no reason to believe this will be any different.
On paper an Emotion Engine was supposed to destroy everything, but achieving maximum throughput was difficult and other constraints such as I/O and memory hampered performance. Programmers had to learn a very different way of programming to make full use of the processor and its two vector units.
A Cell might be a killer chip on paper, but real-world hardware with I/O latency and memory constraints will bring things down to a more reasonable level. Don't forget that multiprocessor programming is *hard*.
Hopefully, developing software for Cell chips will be easier than in the early days of the PS2; Sony already said as much a few months ago.
Well if even a Playstation 3 is going to be the equivalent of 20 Opterons, I imagine you could emulate an x86 in software (a la Virtual PC) and it'd still be as fast as, oh, a few Opterons? :)
I don't know if I can take this article seriously. A games console due out within the next year or two is going to be as powerful as 20 of our current top of the range chips? I'm not buying it.
"Some will no doubt be turned off by the fact that DRM is built into the Cell hardware. Sony is a media company and like the rest of the industry that arm of the company are no doubt pushing for DRM type solutions. It must also be noted that the Cell is destined for HDTV and BluRay / HD-DVD systems, any high definition recorded content is going to be very strictly controlled by DRM so Sony have to add this capability otherwise they would be effectively locking themselves out of a large chunk of their target market. Hardware DRM is no magic bullet however, hardware systems have been broken before - including Set Top Boxes and even IBM's crypto hardware for their mainframes."
Huh?? What are you smoking? Since when does Apple emulate x86 hardware? Perhaps you are confused by the fact that you can buy Microsoft Office for the Macintosh, or run Internet Explorer. Here's a news flash - they aren't emulating the x86, they are native Mac code.
Most likely you are confusing one of two different things. In OS X, programs can be either Cocoa (the new way) or Carbon (the old way) apps. But that isn't really emulation. What is emulation is that Mac OS X can run OS 9 by emulation - nothing to do with x86 here.
Or you can buy Virtual PC, which is now owned by Microsoft. It will allow you to emulate x86 to run Windows (or Linux) and associated. But note - this isn't Apple, and it isn't something that Mac users "have to do".
Lastly, why on earth would you HAVE to be able to run windows programs in order for a Playstation processor to be successful? Last time I checked PlayStation was still wildly successful? More so than MS Xbox I think.
Quotes from article:
"GPUs will provide the only viable competition to the Cell but even then for a number of reasons I don't think they will be able to catch the Cell."
Did this guy forget that NVidia is designing the GPU for PS3? If Cell is so almighty, why does Sony use an NVidia GPU instead of using more Cells for graphics processing?
"There is another reason I don't think Nvidia or ATI will be able to match the Cell's performance anytime soon."
Of course, Cell-based products won't be available anytime soon either. According to the current rumors, PS3 will be available in Japan in Spring 2006 and elsewhere in Autumn 2006. One and a half years equals a generation in the GPU world...
I love this kind of article, where some future products are compared against current ones and declared clear winners...
The article...describes why it is...the most serious threat to x86.
This is a line (excerpted) from the writeup. These two chips (x86, Cell) aren't even in the same ballpark, let alone market.
Even Apple, with their much vaunted G-chips have to emulate the x86 hardware so that users can run their Windows programs.
You sure must be smoking crack or something! Apple has been using PowerPC chips for almost the last 10 years! The Cell chip is supposedly based on PPC technologies, so Apple is actually one of the companies that might benefit greatly from this chip!
Apple making Windows programs only... PffftT!
In one paragraph (Games):
"The Cell designers have concentrated on raw computing power and not on graphics"
In the next paragraph (3D Graphics):
"Again this is a field the Cell was largely designed for so expect it to do well here"
So which is it???
"No Apps"? Try every single video game publisher in the world.
.. Sony or MSFT... I'd say it's absolutely no contest. Sony would crush MSFT. They have better interface design, fewer conflicting platform goals, and they'll put a PS3 in your living room for a fraction of what MSFT could.
And besides, this isn't about "Office" style apps. It's about games, and more importantly: it's about home media centers. I think the Windows MCE is going to have its rear-end handed to it by the PS3.
When you consider that a Cell-based PS3 could have a computational power of *several times* a 3 GHz Pentium...
You have to ask, what's more likely: that Intel can get around IBM/Toshiba patents in time for Windows to conquer the living room with a faster box? (That's if they can even build a secure, stable OS with a decent UI). Or that Sony, now armed with the world's fastest consumer-computing platform, an enormous user base and years of TiVO experience, will own the living room media center market?
If I had to bet on who builds a better media-center PC
I am not sure whether I get it or not, but it seems to me that what is revolutionary here is not just the speed, but the convenience of the architecture: what if you had just standard slots in your computer in which you could plug extra processors, graphics cards, memory, optical drive, hard drive ... anything, using the same exact bus? That would make it so convenient to upgrade and to scale that it would take over the market. Especially if you could plug in an i386 module too.
I cannot see Sony losing so much money on their next gen console as to make this a possibility; they'd be needing far too many games purchases to make up the losses. My money's on the technology being a scaled down version of the setup described -- and when I say scaled down, I mean scaled way down. I'd estimate it's at least five, possibly ten, years before something like this is anywhere near economical enough for a gaming console.
Having said all that, I'd love to be proven wrong. But until the console comes out, we really have no way of being sure one way or the other, and I'd rather have relatively low expectations and be pleasantly surprised than to raise my expectations to Alpha Centauri (at ~4 light years :) and be majorly disappointed.
One way or the other, time will tell.
Unless the thing has an x86 emulation layer, it's dead in the water in regards to the PC market.
If this thing can run as fast as claimed, you can have an x86 virtual machine written in Java and it will still outperform all other Wintel machines.
listen to the wise old man:
With great power comes loads of software
http://www.livejournal.com/users/metricmusic
This sounds like a little PVM-cluster-on-a-chip. It also sounds like it's a pain to program and will, in the short term, suffer from the same problems that Intel's Itanium suffers from: it tries to push too much work on the compiler or software developer.
In the long term, it's nice that companies are exploring these kinds of architectures. It's not nice that they are trying to monopolize what are pretty straightforward architectural choices with patents. This may be a new CPU, but there is little that is new about having a bunch of fast processors interconnected via a reconfigurable network; these just happen to be on the same chip.
Well, perhaps "cool!" is not the correct response...
-------
Warning: Slashdot may contain traces of nuts.
Why the hell do you suppose IBM is helping to Pimp linux?!
Linux is a perfectly capable OS, just as useful to mom and dad as any other OS (once it's set up and running).
I am running Linux on a POWER/PowerPC computer right now.
Cell architecture is a slight departure based on the Power platform. Designed by IBM, manufactured by Sony.
IBM is the king of hardware virtualization. You think VMware is hot? IBM was doing that in the seventies. Right now you can run hundreds of Linux OSes on a single IBM mainframe. The performance loss is minimal. So by combining cells you have the opportunity to use hardware abstraction to create a very powerful computer to run a Linux OS on.
Sony has ported Linux to playstation 2 themselves, so they are used to it and know it.
IBM is releasing a Cell workstation that will run Linux so that Sony Playstation 3 developers have something to work on and run tests on.
Linux has been ported to dozens of different architectures; the default 2.6 kernel has been shown to scale to 64 processors. It runs on numerous disparate 64-bit platforms (Alpha, Sparc, AMD64, Power, Itanium, etc).
Microsoft is still dicking around with porting Windows to AMD64... a platform mostly compatible with x86. (Don't give me crap about NT running on Alpha. It ran as a 32-bit version, and there was an early beta of W2k that ran 64-bit native, but the Win32 API and everything else you use on your computer is and always has been x86-only)
You have TONS of software available for Linux. All of it is relatively easily portable.
Linux is very commonly found in high-end 3D movie making (Shrek, Lord of the Rings; any newer 3D movie was rendered and partially modeled in Linux using custom and high-end tools.)
It's a proven, seasoned platform. Easily portable, cheap, stable, and it exists.
THAT'S why IBM is pushing Linux so much! They are tired of having their decisions dictated by the limitation of Windows and they are doing something about it.
Cell + Linux, a dream made in heaven. (well geek heaven)
If it is true about Cell being so powerful at graphical work, then within a few years it would make Linux and Cell the dominant platform for 3D work, and eventually games.
Screw Microsoft, Windows is potentially holding everyone back.
Or does the logical extension of this chart:
http://www.blachford.info/computer/Cells/Cell_Distributed.gif
Make it look a little more like a HAL than a Cell?
Maybe you (and others) haven't noticed, but the desktop PC is a deer in the headlights. Game machines will take over before you can say 'service contract', with networked apps and entertainment ingrained.
...those boys are cashing in stock options and selling die to Taiwan as we speak. Last one out turn off the lava lamp.
IBM knows this....Sony knows this....Apple knows this. Microsoft knows it as well, thus the lack of steam behind Longhorn.
Intel?
it wouldn't die, just have a different use.
The article makes such outrageous claims of Cell's powers that it makes LITTLE difference if office apps run on it or not. You don't need office apps on your supercomputer.
Indeed, sounds to me like Sony's marketing behemoth is getting into top gear promoting Cell in any way possible, although this might not be directly connected to Sony. Wild claims and theoretical performance papers have been wrong in the past when it came to yet another product with mind-blowing specs (Crusoe, anyone?).
* 4.6 GHz
* 1.3v
* 85 Celsius operation with heat sink
In toasters.. ovens..
End Communication.
Now if they can be made very fast and have only a few (2-8) coupled together... well, as it was said, that is what a nice Opteron machine does anyway nowadays.
One question which was not addressed fully in the article was how do you compile/test programs for this thing.
The potential of parallel architectures has never been in doubt since the early days of the Cray monsters - but how to compile code to use all the features efficiently has.
I don't believe that we will see the full advantage of these types of architecture exploited without some similar breakthrough in software tools.
Mind you the hardware rocks...
No matter how well a processor or group of processors can run tasks concurrently, it will always come down to the fact that most tasks are serial in nature and will not scale to a concurrent processing architecture. Aside from this, developing multi-threaded software is extremely difficult and is rife with problems. Just ask any developer about the hardest problem to find/debug. It is pain incarnate and some MT bugs can take 5+ days to find. People design serially, because a lot of tasks are essentially serial in nature, and until this design paradigm gets a major shift and we design parallel-only software [LOL], Cell has no future.
Who cares? Mac OS X and Linux will provide all the applications required. Windows apps will likely be available under emulation. The Windows market will still dominate but there will be a gradual migration when people realise there are cheaper/better realistic alternatives available at last.
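The serial-versus-parallel argument above is usually put in numbers with Amdahl's law. The following is only a rough sketch: the function name is made up for illustration, the 8 workers stand in for the 8 vector units mentioned elsewhere in this thread, and the parallel fractions are arbitrary assumptions rather than measurements of any real program.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of a task can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at normal speed; only the parallel part is divided.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Assumed fractions, purely illustrative.
    for fraction in (0.10, 0.50, 0.90, 0.99):
        speedup = amdahl_speedup(fraction, workers=8)
        print(f"{fraction:.0%} parallel on 8 workers: at most {speedup:.2f}x")
```

The serial part quickly dominates: under this model, a task that is only half parallelisable cannot even double in speed, no matter how many vector units are attached.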
Wow, and just think how the "technology world" 10 years ago was basically identical. I think they could've used a better phrase here.
Especially now that Sony has decided to be a hardware company once again.
It's not offtopic, dumbass. It's orthogonal.
I'm sorry, but Sony can kiss my ass.
This is from the company that said the Playstation 2 would have Toy Story quality graphics, and be able to render FF8 quality FMVs in real time (thus making FMVs no longer required). It was essentially that bullshit hype that killed the Dreamcast... so yeah, now they're at it again.
Maybe I'll be proven wrong, but I doubt their system will be able to do anywhere near what they say it can in practical application.
There's a lot of weird stuff in the conclusions part where he compares the Cell against x86 and PowerPC 970 chips. Look at the Apple blurb for example. He's trying to paint this rosy picture about how Apple can sell a bunch of cloned Macs with Cell processors... that's just foolish. Apple has stuck with the PowerPC architecture for a long, long time now; there's no way they would rewrite OS X (and force everyone to rewrite their apps) just to make a bunch of cheap clones. Not just that, but Apple is now well known for outstanding hardware design. No way they're going to license anything to a cheap, beige-box manufacturer. So from my standpoint, his last page was a big load of BS...
I'm willing to believe that a 4.6 GHz chip with 8 ALUs and high bandwidth memory would be fast, but even in bulk, there's no way they can afford to put 4 of them in a sub-$500 game console.
I've been reading PR about the Cell for years, and nothing I've ever read has seemed even remotely plausible. Is there any objective information that even comes close to substantiating any of these claims?
Clippy needs 2 cells to run!
I think the guy who wrote this is over-hyping the cell, but I wouldn't be surprised if the cell smoked everything out there for things like video, audio, and 3D graphics.
Sony and Toshiba may end up owning the living room, while Microsoft etc. fight it out for the desktop. Everyone in the PC industry thinks that the living room will make them more money...
"Giving money and power to government is like giving whiskey and car keys to teenage boys" P. J. O'Rourke
He's only posting here because he doesn't have 10 friends!
future computer ads might sound like '4x4' all-wheel drive ones; 'transfers bits from the cells that grip, to the cells that slip'
i didn't understand any of the document, but damn it looks fast
Nothing costs nothing
Does anyone remember the Transputer technology? It was also based on some cell computing approach and was also meant to replace the PC architecture one day.
What most slashdotters fail to realise is that a vector processor would blow chunks if used for general purpose processing, and hence you won't be seeing a desktop based on them anytime soon.
Sure, they may have vector processing in the form of SSE etc., but vector processing is of no use for word processing, period!
It only performs like 20 Opterons in highly parallelisable tasks. Which excludes almost every task performed on the average PC, with the exception of some gaming graphics tasks (which, incidentally, are performed on specialised GPUs which vastly outperform x86 cores for their tasks anyway). Most of the time, a single Cell core will perform pretty much identically to the single Power chip that controls it.
Also it helps to understand how IBM runs its mainframes.
You have virtual PCs. These are programs like VMware that let you run multiple computer OSes.
Well, IBM has similar technology, except that while VMware has been doing its thing for a while now, IBM has been doing this for almost 30 years. They have gobs of experience.
So look at the big mainframe. A beast. The CPU isn't all that powerful, but its main strength is a MASSIVE amount of I/O bandwidth. In a PC you have a PCI bus that has a maximum bandwidth of 130 or so MBps. A big new IBM mainframe has 20GBps worth of bandwidth.
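For what it's worth, the "130 or so MBps" figure can be checked with back-of-the-envelope arithmetic. This sketch assumes the classic 32-bit, 33.33 MHz PCI bus (the comment does not say which PCI variant is meant), takes the 20 GBps mainframe figure from the comment at face value, and uses a made-up helper name.

```python
def peak_bus_bandwidth_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak: one transfer of width_bits per clock cycle."""
    bytes_per_transfer = width_bits / 8
    return bytes_per_transfer * clock_mhz  # MHz * bytes = MB/s


# Assumption: classic PCI, 32 bits wide at 33.33 MHz.
pci_mb_per_s = peak_bus_bandwidth_mb_per_s(width_bits=32, clock_mhz=33.33)

# The 20 GBps figure is the commenter's, not a verified specification.
mainframe_mb_per_s = 20_000
ratio = mainframe_mb_per_s / pci_mb_per_s

if __name__ == "__main__":
    print(f"PCI peak: about {pci_mb_per_s:.0f} MB/s")
    print(f"Claimed mainframe I/O is about {ratio:.0f} times that")
```

These are theoretical peaks; sustained throughput on a shared bus is lower.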
You have the main OS that runs on bare metal. This is called VM. Inside this OS you have things called 'partitions'. These are not like hard drive partitions; they are processing partitions.
In each of these partitions you can run an OS. You can run OS/370, for instance, for legacy batch programs left over from the 80's and early 90's. I am sitting right next to a machine that runs a similar setup for JCL programmers.
However you can run Linux on them, too. You can run HUNDREDS of Linux OSes in them, actually.
And the significant thing is that you can do this with almost no overhead. Programs run as fast, or potentially FASTER, than when they are run on "bare metal".
Now look at the Cell architecture. You have a Power970-style CPU that runs as a controller for all these special-purpose vector processors.
You have a VERY high degree of control over the system. Unlike x86 there is almost no abstraction...
Take virtual memory address spaces (if you think that VM = swap space, I am talking way over your head). You have an abstract memory space UNTIL it's being used; then it's hardcoded and most of the control is done by the host OS...
So this is what I see as the future for the Cell workstation:
You have the Cell hardware. On top of that you have the special IBM OS that controls the hardware. In processing partitions like you have with mainframes you have either a Linux or OS X operating system that runs all the user's applications. You can run one OS, or numerous operating systems at the same time.
To the OS X/Linux operating system there will be multiple CPUs that it can access. These multiple CPUs will be the 'cells'. To the OS and to the user it will appear to be a run-of-the-mill PowerPC, just like an iBook or a Power Mac G5.
You will have the ability to render something similar to Doom 3 in real time in OpenGL SOFTWARE RENDERING MODE 10x faster than the fastest PC available anytime in the next 5 years.
Hello, folks. This is the speed these guys are talking about! This isn't "most powerful PC in the world" bullshit talk like from Apple. This shit is the next level. This is Star Trek shit, this is your-computer-is-now-HAL type stuff.
FYI Crays come (came) with huge quantities of libraries, preprocessors, compilers, schedulers, profilers, all designed to help efficiently use the hardware.
Don't post comments that say nothing other than "I don't understand."
I think that would make the GeForce 6800 GT/Ultra/Lite/Whatever look like a Voodoo 1 and a half. a 3d mark pro 05 score of about 30000...hmmm I guess I should patent that...wait a minute...never mind.
Read the article you pontificating ignoramus, or are you just trying to seem intelligent without making the effort to actually be so.
Yeah, that's what I thought. You are a worthless dick weed, please go kill yourself and thus vastly improve the mean human intelligence quotent.
Oh yes, please kill your family on your way out as we can do without the gene pool contamination.
In your next life, consider first learning to read, then learning to comprehend what you read, and, lastly learning when & how to appropriately share your perhaps-by-then worthwhile opinion.
Until then, please go back to watching teletubbies whilst sucking your thumb and cuddling your carebear.
Wonder if IBM looks into the future and doesn't see PCs anywhere? Intriguing possibility.
While I tend to agree the Cell is an impressive architecture, this article is a steaming pile of B.S.
No cache for CPUs? A breakthrough? Hello! Both PSone and PS2 have the so-called scratchpad, which is what the Cell seems to have: a cache which has to be managed explicitly by the programmer. Breaking news: This is a royal pain in the ass. And calculating bandwidth when reading from these tiny scratchpads makes about as much sense as calculating the speed at which an x86 processor can execute MOV EAX, EBX.
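To make "managed explicitly by the programmer" concrete, here is a toy sketch of what working through a scratchpad looks like. It is plain Python, not PS2 or Cell code: the capacity constant and the function name are hypothetical, and a list copy stands in for what would be an explicit DMA transfer on real hardware.

```python
SCRATCHPAD_WORDS = 1024  # hypothetical capacity, not a real Cell or PS2 figure


def sum_of_squares(main_memory, scratchpad_words=SCRATCHPAD_WORDS):
    """Sum the squares of a large array using only a small local buffer."""
    scratchpad = [0] * scratchpad_words  # stands in for the on-chip local memory
    total = 0
    for start in range(0, len(main_memory), scratchpad_words):
        # Step 1: the programmer decides what to bring in and when.
        # On real hardware this would be an explicit DMA transfer.
        chunk = main_memory[start:start + scratchpad_words]
        scratchpad[:len(chunk)] = chunk
        # Step 2: compute only on data that is resident in the scratchpad.
        for i in range(len(chunk)):
            total += scratchpad[i] * scratchpad[i]
    return total


if __name__ == "__main__":
    data = list(range(5000))
    print(sum_of_squares(data))
```

With a hardware cache the chunking loop would not exist; here every algorithm has to be restructured around the buffer size, which is the pain the comment is describing.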
Magically "the OS solves everything", and, in an obvious attempt to automatically get OSS-crowd support (is that "slashdot-trolling" or "slashdot-baiting"?) the triumph of Linux is predicted, because it's portable. Good luck getting the Linux kernel and GCC compiled, let alone running well on a massively parallel array of tiny CPUs without cache.
You have not read it. It will be on a specific class of tasks. It is similar to modern GPUs. They are faster than 10 Opterons on a specific task.
Back to the article. The guy seems to understand hardware, but he does not understand shit about software. Once he got past the first 3 parts he started babbling. Linux on Cell, and so on, and so forth. If he had just read his previous parts he should have hit himself on the head. The only type of Linux this can run is uClinux. There is no memory protection as such. So no Linux, no Windows past 2000, no MacOS past X, and so on, and so forth.
Similarly, it is all nice and well about Cell software beasties making herds by themselves and cooperating on a task. I am going to be a spoilsport and ask a nasty question: Err.. What about a security model? Memory protection? Privilege model for communications? And so on, and so forth...
To continue on this, the power of a modern general purpose OS is the task switching. How long does it take to load and store the context of the vector processing units? Doing so requires moving their dedicated memory to main memory. This will take ages.
Overall, this is a design similar to Cray 1 initial design. Cray initial design smashed the IBM, DEC (and lesser fish) monopoly on big computing iron to bits. Unfortunately the next thing the people buying the Cray asked for was "can we share this resource between two people?". The answer was provided eventually, but by the time Cray could do all the nifty time sharing and memory management tricks necessary to do this its advantage was no longer phenomenal. And all people who could use Crays for single tasks with manual scheduling actually continued to use it that way. But it did not even dent the general purpose big iron market.
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/
iddqd mother fucker.
>Did this guy forget that NVidia is designing the GPU for PS3? If Cell is so almighty,
>why does Sony use an NVidia GPU instead of using more Cells for graphics processing?
One possible reason is the cost. When you can save a large area of a silicon die by using a specialized DSP, why waste processing power in a CPU? nVIDIA can provide reasonably efficient solutions, such as texture units and pipes, aimed at more specific types of processing. Cost is everything when you manufacture millions of them. At least Microsoft seems to have learned that from the tremendous loss incurred by the Xbox business.
So, it looks really interesting, but the question is really the price. One of the most expensive parts of a processor (at least as it was explained to me) is the registers. The article says that the Cell processors will be able to have huge amounts of registers (128 * 128-bit registers per APU, times 8, plus a couple for the PU). If registers are so cheap, I do not see what will prevent Intel/AMD from adding a technology that uses multiple sets of registers, like a MegaHyperThreading, to overcome the 80% of the time spent waiting for fetches (I believe Sun already used the same trick in its SPARC processors).
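For scale, the register figures quoted above work out as follows. This is just arithmetic on the numbers in the comment (128 registers of 128 bits per APU, 8 APUs); the PU's own registers are left out because the comment gives no count for them, and the names are invented for the example.

```python
REGISTERS_PER_APU = 128
BITS_PER_REGISTER = 128
APUS_PER_CELL = 8


def register_file_bytes(registers: int, bits_per_register: int) -> int:
    """Storage held in one register file, in bytes."""
    return registers * bits_per_register // 8


per_apu_bytes = register_file_bytes(REGISTERS_PER_APU, BITS_PER_REGISTER)
total_bytes = per_apu_bytes * APUS_PER_CELL

if __name__ == "__main__":
    print(f"per APU: {per_apu_bytes} bytes ({per_apu_bytes // 1024} KiB)")
    print(f"8 APUs:  {total_bytes} bytes ({total_bytes // 1024} KiB)")
```

So the "huge amounts of registers" come to 2 KiB per APU and 16 KiB across the chip. Raw capacity is only part of register cost, though; the number of read and write ports matters too, and that is not captured here.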
... yet ;-) it has probably still a couple of tricks to play.
The price of memory coming down also means that Intel/AMD will be able to add a huge amount of it to their multi-core processors.
Third thing: concerning power, recent developments and rumors are hinting at partly tickless architecture for next generation processors which would decrease in some significant way power consumption.
Guess the PC is not dead
Guillaume
yea, you are right. I needs to emulate DOS so I can practice my old-skool Rise of the Triad skillz
I think that it seems more likely that someone will licence the Cell and replace the G5 core with an x86 core. And I don't see anything wrong with that; it's the normal way to do things :)
:)
So, I think that you'll find that this is about the embarrassingly parallel stuff - compilation, for instance, could be implemented on the Cell. So what might happen is compilers compile targeting an x86 or G4/5 with APUs attached. All in all, I think that we'll need some pretty sweet compilers for this cool stuff.
Cheers,
Michael
Kids today are tyrants. They contradict their parent, gobble their food, and tyrannize their teachers. - Socrates 400 BC
huahahaha
After reading the article, this Cell architecture sounds like a virus breeding ground. Networking without user intervention? Across any kind of Cell system? Imagine a virus spreading from your home computer to your cell phone to your servers at work. Just add some random "memes" and polymorphism and you have enough processing power to create a petri dish of evolving computer viruses.
If it's simply a "programmable cache" like you are so quick to claim then that means you could duplicate traditional cache behavior in software.
And in that case I don't really see where the porting problem comes in.
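As a rough illustration of "duplicate traditional cache behavior in software", here is a toy direct-mapped cache layered over a small local buffer. Everything in it is an assumption made for the example (class name, line count, line size); it says nothing about how Cell software would actually do this, and on real hardware the tag check on every access is exactly the overhead a hardware cache hides.

```python
class SoftwareCache:
    """Toy direct-mapped cache kept in a small local buffer (hypothetical sizes)."""

    def __init__(self, backing_store, num_lines=4, line_words=8):
        self.backing_store = backing_store
        self.num_lines = num_lines
        self.line_words = line_words
        self.tags = [None] * num_lines  # which memory line each slot holds
        self.lines = [[0] * line_words for _ in range(num_lines)]
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if not 0 <= address < len(self.backing_store):
            raise IndexError("address outside backing store")
        line_number = address // self.line_words
        slot = line_number % self.num_lines
        if self.tags[slot] != line_number:
            # Miss: fetch the whole line from "main memory" into the local slot.
            self.misses += 1
            base = line_number * self.line_words
            fetched = self.backing_store[base:base + self.line_words]
            self.lines[slot][:len(fetched)] = fetched
            self.tags[slot] = line_number
        else:
            self.hits += 1
        return self.lines[slot][address % self.line_words]


if __name__ == "__main__":
    memory = list(range(100, 164))
    cache = SoftwareCache(memory)
    values = [cache.read(a) for a in range(16)]
    print(values == memory[:16], cache.hits, cache.misses)
```

This only covers reads; a full replacement would also need write-back and some policy for code, which is where the porting effort would likely go.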
You have to realize that PCs are no longer the high tech of the computer industry. The fastest mass-produced CPU on earth was in a PC in the '80s and in the '90s, but not nowadays.
At first most PCs were sold for engineering/scientific uses, then this shifted to business administration, and it is now shifting to households. And it will then probably shift to the trashcan. Microsoft made all these shifts together with the PC industry, and as the revenue is moving to consoles/DVD players/entertainment centers, Microsoft is trying to move with it. MS may be successful. The x86 will die. Applications and software support were important for engineering and for office use, and are for home/internet use, but this is a much simpler question with consoles.
I do not say the PC will die, but it will never more be high-tech. Just like there are very sophisticated alloys, still hacksaws are made of good old steel, because nobody would pay 10 times for the ultra modern alloy. vajk
Not just games too, since it is also good for DSP based activities. Audio processing is a DSP heavy task, imagine the quality of the audio you could get out of these buggers. If you can hack it and load Linux on, then you could use it as the most powerful DSP based soft synth around. Think Nord Modular style power and you'll get my drift. I'm drooling just thinking about it - cheap consumer hardware that could provide the horse power of a full rack of specialised audio gear.
All those moments will be lost in time, like tears in rain.
www.clearspeed.co.uk
Since the main goal of the chip is to pump through graphics, regardless of what device it's in, a GPU is better grounds for comparison.
From TFA: "Existing GPUs can provide massive processing power when programmed properly, the difference is the Cell will be cheaper and several times faster."
It's supposed to do 250 GFlops when? 2 years from now? Apparently the GeForce 6800 Ultra will do 40 GFlops and that's today... extrapolate with some doubling here and there and it seems a lot more reasonable.
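The "doubling here and there" can be spelled out. This sketch takes the 40 GFlops and 250 GFlops figures and the 2-year horizon from the comment above; the doubling periods of 6, 12 and 18 months are pure assumptions for illustration, not GPU roadmap data, and the function names are made up.

```python
import math


def doublings_needed(current_gflops: float, target_gflops: float) -> float:
    """How many times performance must double to go from current to target."""
    return math.log2(target_gflops / current_gflops)


def projected_gflops(current_gflops: float, months_ahead: float,
                     months_per_doubling: float) -> float:
    """Simple exponential extrapolation with a fixed doubling period."""
    return current_gflops * 2 ** (months_ahead / months_per_doubling)


if __name__ == "__main__":
    gpu_today = 40.0     # GeForce 6800 Ultra figure quoted in the comment
    cell_claim = 250.0   # Cell figure quoted in the comment
    months_ahead = 24    # "2 years from now"
    print(f"doublings needed: {doublings_needed(gpu_today, cell_claim):.2f}")
    for period in (6, 12, 18):  # assumed doubling periods, in months
        estimate = projected_gflops(gpu_today, months_ahead, period)
        print(f"doubling every {period} months: about {estimate:.0f} GFlops")
```

Going from 40 to 250 takes about 2.6 doublings, so GPUs would need to double roughly every nine months to match the claimed figure in two years.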
So the big thing is that it comes down to programming. It came up a few times in the article: "Doing this will make it faster but will make for one hell of a time for the programmers". It may have huge potential but it may take a while to get everything as efficient as Sony would like. Reminds me of when the GF3 first came out and was beaten by the GF2U in some tests. IIRC it took a while for games to come out that took advantage of its programmability. It'll be interesting to see how well the programmers fare between now and Cell's release.
If you bothered to read the article, a beowulf cluster would probably just slow down the potential of the Cell design. The bus on these is meant to be directly connected to other cells. although it would have problem working over a cluster, it would be limiting, expensive and stupid.
If you think about it, to the vast majority of users (i.e. Joe Public) it's going to make no difference.
Does it speed up the internet? No
Does it make Word print faster? No
Does it view p0rn quicker? No
Yes, it's good, but it will be of limited use to Joe Public
AROS probably could run on it.
Change is certain; progress is not obligatory.
And boots up instantly and doesn't sound like a Boeing 747 taxiing.
There are several assumptions that lead to tremendous theoretical performance figures. The simple fact is that like the Itanium, the Cell processor depends on some rather complicated software that will solve issues like parallelism, coherency etc. The article clearly states that the Cell architecture is a combination of software and hardware (1st page). This is good because performance can always increase (via a better OS or microcode) but it is also bad because it means that initial versions may not stand up to their performance claims.
;-)
Also, let's not forget that developers will be unable to keep up unless some highly sophisticated libraries and languages are made available. I really don't expect the majority of developers to be able to cope with massive parallelism from the beginning (not just 2x SMP or hyperthreading; this needs a totally different mindset).
To sum this up: the hardware will deliver, but the software is a critical unknown in the equation. I have faith in IBM
P.
Service contract
Weird... nothing happened.
Change is certain; progress is not obligatory.
From my reading of several websites, I suspect there will be 2+ PS3 consoles. The first would have all of these Cells and do all the background work, the second would have 1 or 2 Cells and play the games, and a third might have just 1 Cell and be used to play PS2 games.
So, what you're really saying is you have no experience and no future in consumer computing devices.
The future:
iPod or MS Media Center?
Hint: the answer is the first one.
PS2/3 or Xbox?
Hint: the answer is the first one.
I read all five sections at once, intending to stream each chapter through separate phases from character recognition to criticism. Unfortunately, every time the article used "it's" in a predicative sense, everything ground to a halt.
Fortunately, cell reading meant I hardly noticed the claim that hardware would compete with the x86 because, unlike the x86, cell computers need all their software written for the specific hardware.
I like how "hardware-specific" becomes "OS-independent". Great I can plug my HDTV into my G/Fs "electrically powered adult novelty device", and harness the extra computing power to find out we are really alone in the world. Of course, no firmware will stand in the way.
I'm also surprised that, in pandering to all the OS underdogs in the slashdot crowd (Great day for Apple, since they like G5s; Great day for Linux, since many obsessive-compulsive coders work on Linux projects anyway), he left out a true lightweight OS designed from the ground up for just this sort of multitasking: Amiga OS 4.0. To get something like this to actually work, you'll need more than iPod huggers, OSX preachers or Linux fans. You need genuine madwomen and madmen. You need AmigaOS.
they have a theoretical computing capability of 250 GFLOPS (Billion Floating Point Operations per Second)
It sounds great, can't wait to have a cell system running through my house . . . until he starts using these arguments that don't really lead anywhere?
"The Cell approach does give some of the benefits of abstraction though. Java has achieved cross platform compatibility by abstracting the OS and hardware away, it provides a "virtual machine" which is the same across all platforms, the underlying hardware and OS can change but the virtual machine does not.
Cell provides something similar to Java but in a completely different way. Java provides a software based "virtual machine" which is the same on all platforms, Cell provides a machine as well - but they do it in hardware, the equivalent of Java's virtual machine is the Cells physical hardware. If I was to write Cell code on OS X the exact same Cell code would run on Windows, Linux or Zeta because in all cases it is the hardware Cells which execute it."
Which basically reads "the x86 platform is like a virtual machine, except in hardware, because all the different x86 chips use the same instruction set"
The funniest bit is the 'achieving close to theoretical peak performance'.
I hope to see more bollocks on fox news.
There is memory protection. Read the whole thing. What I think bit you was the fact that he said there was no virtual memory... well, even then his wording is confusing, as virtual memory is just swapping out pages of memory as you need more. This can be done on the Cell. What I think he is talking about is address translation. The paging hardware presumably does not implement a full LogicalAddr==>LinearAddr==>PhysicalAddr paging/segmentation unit (I have not read the patent myself). He mentions that during runtime the address must be physical/real and that, when running on an APU, programs may be given access restrictions. I must digress, though, and tell you that I am no expert either. The OS is in for quite a bit of work when dispatching apulets, as I can see adjusting addressing and other things being as interesting (or more) as the different scheduling mechanisms are in current systems. Getting a secure system out of this will require protected memory, and if I remember correctly the Cell may be capable of running multiple OSs in parallel VMs. This can be explained by considering that IBM has its own software layer that one's OS would talk to (at least the article made it seem that way). It's almost like having a microkernel (or exokernel in some ways) that then has real things attached to it. Like Linux, for example. Linux can already be run in user mode and even on top of the L4 microkernel. Linux has been shown to be portable enough (along with most good modern software). I would have no doubt about seeing this happen with IBM.
ruby -le"32.times{|y|print' '*(31-y),(0..y).map{|x|~y&x>0?'
As the article mentions, there will be 4 Cell processors in each PS3. I have been perusing the patents applied for by Sony and can offer Slashdot readers some technical insights as to what they will be used for:
2 Cells to handle the load of processing Sony's new DRM technology aka "Fuck You - Shut Up & Consume Technology" (TM).
1 Cell to search your home network for MP3s and convert them all to ATRAC, deleting the originals.
The remaining Cell will co-ordinate with other PS3s that it finds on the Internet and launch a huge permanent DDOS attack on OSNews. It is expected that the number of articles appearing on Slashdot will drop to approx 10% of the current level at about the same time.
Okay, I made the last one up, but I can dream can't I?
The ever increasing word processing needs are what are driving the development of ever faster PC CPUs and graphic cards....
Oh wait,...
The linux software opengl lib. It is a 100 ton system killer complete copy of a video card done in processor problem when 40 g is tried to be copied by 6g.
Back to the article. The guy seems to understand hardware, but he does not understand shit about software. Once he got past the first 3 parts he started babbling. Linux on Cell, and so on, and so forth. If he had just read his previous parts he would have hit himself on the head. The only type of Linux this can run is uClinux. There is no memory protection as such. So no Linux, no Windows past 2000, no MacOS past X, and so on, and so forth.
Err, no. Linux will run on architectures without MMUs no problem. In fact, the PS3's OS will be based on Linux.
I'm trying to work out how he's getting a 4.6 GHz G5 on a chip along with all the processing units while Apple currently can't make their (singular) chips go above 3 GHz.
This is a very interesting architecture. It arises quite logically from the 'rediscovery' of vector processing for high throughput, low interdependence instruction streams that GPUs represent. For a long time, home computers didn't do the kinds of things vector processors were good at - they did complex, heterogeneous instruction streams, and the processors evolved to match. You can't run Word on a Cray after all. We got micro-ops, RISC, superscalar architectures, multi-level burst filling caches, branch prediction, hyperthreading, all the things that, roughly speaking, make a stream of instructions go quickly when they depend on each other's results a lot, don't repeat themselves much, and get information from all over memory.
The thing is, the jobs vector computers are good at started cropping up in home computer loads, mainly for graphics and media type uses. The industry started responding to this with dinky little SIMD cores in essentially conventional processors, and then we got the GPU, which makes many more compromises to get you greater throughput. Now we have this thing, which is less graphics specific than a GPU but much more vector-heavy than a microprocessor. Most of the 'traditional PC' work, executing code, will be done in the 'supervising' processor I think, which is why it's so fat - you don't need a G5 just to push jobs around after all. Where these vector units come in is for doing 'work units' of the kind of stuff vector processors are good at - 3D graphics, physics, compression and decompression. Basically, all the heavy lifting needed for games and media use.
Those who have said it will be very hard to program are correct. If you ran a conventional program on it, you'd get the power of the supervising processor and not much more. You'd have to start looking for these 'work units' in your game, or whatever, and shooting them out to the vector units, along with 128k of all the data they'll ever need, and then pulling the results back when they're done. You'd have to cut the work units small enough so that they were done by the time they were due to hit the screen, or the speakers, which is why there is so little RAM in each vector unit - the jobs aren't expected individually to run for all that long. It is also why the article mentions realtime stuff built in there somewhere, so you know if you're falling behind before the output buffers run dry, and can do something about it. The more of these vector jobs you found, the faster you'd go. You are really looking however at designing a whole program with the philosophy of an OS writer - there will be no abstraction here, and to get the performance you'll have to work for it.
The thing is, where has abstraction really got us, we developers? Code is easier to write, and we can write it more quickly and ambitiously than we used to be able to. The problem is that we seem to have 'spent' a whole lot of the new capacity the hardware industry gives us on this one aim, the result being that this 2GHz P4 running XP in front of me seems about as responsive as a Mac Classic. Maybe it's time we gave up a bit of our precious abstraction, worked a little harder and passed a bit more of that new capacity we get every year on to the users. After all, the first guy to write a game that really does this for this kind of architecture is going to look pretty good, and the guy who comes out the next week with a simple port of the PC version is not. It'll be interesting to see how the game programmers go at getting the best out of this thing. Rumours seem to indicate that they didn't really manage to 'get' the PS2 in this way...
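The 'work unit' idea above (cut the job into pieces that fit in a vector unit's 128k, shoot them out, pull the results back) can be sketched like this. It is only an illustration in Python: the thread pool is a stand-in for the vector units, `process_unit` is a hypothetical placeholder job, and nothing here reflects an actual Cell programming interface.

```python
from concurrent.futures import ThreadPoolExecutor

# Per-vector-unit memory as quoted in the comment above (128k).
LOCAL_STORE_BYTES = 128 * 1024


def split_into_work_units(data, unit_size=LOCAL_STORE_BYTES):
    """Cut a buffer into chunks no larger than one unit's local memory."""
    return [data[i:i + unit_size] for i in range(0, len(data), unit_size)]


def process_unit(chunk):
    """Hypothetical stand-in for a vector job: here it just sums the bytes."""
    return sum(chunk)


def run_on_units(data, num_units=8):
    """Dispatch every work unit to a pool of workers and combine the results.

    The thread pool only models the dispatch pattern; the assumption of
    8 workers mirrors the '8 vector units per cell' figure from the thread.
    """
    units = split_into_work_units(data)
    with ThreadPoolExecutor(max_workers=num_units) as pool:
        partial_results = list(pool.map(process_unit, units))
    return sum(partial_results)


if __name__ == "__main__":
    buffer = bytes(range(256)) * 2048  # 512 KiB of sample data
    print(len(split_into_work_units(buffer)), "work units")
    print("combined result:", run_on_units(buffer))
```

The supervising code does the splitting and the combining; the workers only ever see one chunk. That is the part a conventional program would have to be restructured around.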
Linux/OS X won't run directly on the hardware.
IBM has lots of money and time invested in virtualization of hardware. They have been using this stuff in mainframes for 20-30 years and have it good enough that you can run many OSes inside each other without much slowdown or overhead.
Think about what Transmeta does with its running of x86 code on non-x86 hardware.
So you have a 'VM'-style parent OS that will take care of all the abstraction that a modern OS needs to be run. Sort of like a microkernel.
But instead of abstracting in hardware like you do with x86 or current PowerPC, you do it with the software 'VM' OS.
So with some modifications, theoretically current PowerPC gcc compilers should be able to produce code that will run on a Cell, if the Cell is set up for it.
Of course you have lots of optimizations and such to do, but that's what IBM's millions are for.
IBM already does this with its mainframes. You have bare hardware. You have VM which runs on the hardware, and you have processing partitions which get a slice of processing time and memory. In those processing partitions you run your Linux OS. That's where IBM gets the claim that a single machine can run hundreds of Linux OSes.
The processing overhead for something like this would probably run 5-10 percent. You'd have much slower word processing and web browsing than on x86, but anything to do with 3D acceleration or floating point calculations ought to scream.
Fairly straightforward I would have thought; use lots of thick stone, big locks, and don't forget the bars on the window.
~~~~~ BigLig2? You mean there's another one of me?
Now look what you've gone & done...
hype hype hype, hype hype hype hype. hype hype hype hyiiiiiiiiipe hyiiiiipe hyiiipeeeeeeeeeee...ooooohhhhhh ya
What everyone seems to be missing is that it is not necessary to parallelize your software; instead you could organize it into a sort of pipeline, exactly like a CPU does but, of course, with much larger blocks.
A DivX codec on four processing units, for example, could use one Cell to do a quarter of the work of decoding a frame, then hand the results to the next Cell, submit the next frame to the first one, and so on. Decoding would go almost four times faster, wouldn't it?
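The "almost four times faster" guess can be checked with a small timing model. This is a sketch in Python under strong assumptions that are mine, not the poster's: every stage takes exactly the same time, and handing a frame to the next stage costs nothing.

```python
def serial_time(frames, stages, stage_time=1.0):
    """One unit does all stages of every frame, one after another."""
    return frames * stages * stage_time


def pipelined_time(frames, stages, stage_time=1.0):
    """Each stage runs on its own unit.

    The first frame needs `stages` steps to pass through the pipe;
    after that, one finished frame comes out every step.
    """
    if frames == 0:
        return 0.0
    return (stages + frames - 1) * stage_time


def speedup(frames, stages):
    """Ratio of serial time to pipelined time for the same workload."""
    return serial_time(frames, stages) / pipelined_time(frames, stages)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(f"{n:5d} frames on 4 stages: speedup {speedup(n, 4):.2f}x")
```

Under those assumptions the speedup creeps towards 4 as the number of frames grows but never reaches it, and a single frame gains nothing. Uneven stages or hand-off costs would pull it lower; the slowest stage sets the pace.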
Do you deep-fat fry those, or are they baked?
Okay, I got Linux installed. So where's the free beer everyone keeps talking about??
A long time ago, in a galaxy far away, CPU transistor budgets were measured in tens of thousands. You had _barely_ enough of them for either more registers _or_ a more complex decoder, but never both.
This begat RISC. A CISC computer had a more complex instruction set, but that barely left it with enough transistors for a couple of general purpose registers. A RISC computer, on the other hand, went by the mantra "never do in hardware what the compiler can do for you", so it had an over-simplified instruction set, but then it had enough transistors left for more registers.
In a sense each of the two was too expensive for _someone_. For CISC, registers were too expensive. For RISC the decoder was too expensive. In truth, both were expensive, and the grand unified theory ;) is simply that you just couldn't have _both_.
Fast-forward a bit, and registers are _not_ expensive for CISC any more. You do mention "what will prevent Intel/AMD to add a technology which could use multiple sets of registers", so the answer to that is: they already do. Both have huge register stacks they internally use for renaming. (E.g., when you swap EAX and EBX, the data isn't really copied, but the register from that huge stack which is currently EAX and the one which is currently EBX, get renamed to EBX and EAX respectively.)
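The renaming trick in the EAX/EBX example can be illustrated with a toy model. This is a Python sketch of the idea only; the class, its method names and the register-file size are made up for the illustration, and it models just the name-to-slot mapping, not how any particular Intel or AMD core implements renaming.

```python
class RenameTable:
    """Toy model: a few visible register names mapped onto a larger hidden file."""

    def __init__(self, visible_names, physical_count):
        if physical_count < len(visible_names):
            raise ValueError("need at least one physical register per name")
        self.physical = [0] * physical_count
        # Each visible name points at one slot in the hidden register file.
        self.mapping = {name: slot for slot, name in enumerate(visible_names)}
        self.data_moves = 0  # counts how often a value is actually written

    def write(self, name, value):
        self.physical[self.mapping[name]] = value
        self.data_moves += 1

    def read(self, name):
        return self.physical[self.mapping[name]]

    def swap(self, a, b):
        # No value is copied: only the two names trade slots.
        self.mapping[a], self.mapping[b] = self.mapping[b], self.mapping[a]


if __name__ == "__main__":
    regs = RenameTable(["EAX", "EBX"], physical_count=128)
    regs.write("EAX", 1)
    regs.write("EBX", 2)
    regs.swap("EAX", "EBX")
    print(regs.read("EAX"), regs.read("EBX"), "data moves:", regs.data_moves)
```

After the swap the program sees the values exchanged, yet the move counter has not changed, which is the point being made about the hidden register stack.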
Either way, they already have the Cell's 128 registers, and some even have 256 registers. You just don't see them from the outside. (Which is a pity, since compilers could really use them.)
Is there something to stop them from exposing more of those registers to the outside world? Nope. The AMD Athlon 64's (now also adopted by Intel) "64 bit extensions" already do just that: they double the number of general purpose registers visible to the program. That's largely what gives an Athlon 64 the speed boost when running 64 bit code: the extra registers.
Is there something to keep them from doubling or quadrupling them again? From a technical point of view, nothing whatsoever.
What's been keeping them so far is the software backward compatibility. A Pentium 4 still has to run code written for a 486. So whatever changes you do to their instruction set, they must leave the old pre-existing instructions unchanged. And there simply aren't bits left to add the new registers without changing the whole instruction set.
The migration to 64 bits has been such a good excuse to come up with a completely new instruction set, with more general-purpose registers. But such excuses are few and far between.
As for RISC... it died, it lost the battle. Yes, Apple and IBM still use it as a marketing buzzword, but that's it. There are _no_ RISC CPUs still being produced.
The G5 in Macs is simply a CISC with more registers and a better instruction set, but it's CISC nevertheless. Its internal structure is _not_ RISC, and AltiVec is _not_ RISC. They're in fact contrary to everything RISC stood for.
Ditto for Sun's UltraSparc.
(And everyone arguing that a G5 is RISC, has obviously never programmed a RISC CPU before. You had to take care of every single detail in software, because of the mantra "never do in hardware what the compiler can do for you." Even recovering from a pipeline overflow when an interrupt came, you had to do that in software.)
Hope this helps.
A polar bear is a cartesian bear after a coordinate transform.
If only Cell could do what is required of a desktop processor... i.e. running tens of processes with a total of hundreds of threads (this WinXP PC currently has 61 processes and 523 threads running, with current memory usage at half a gigabyte) at the same time, with seamless virtual memory with disk swapping and process level memory protection. Cell can't do this, and if it were optimized for stuff like this, it'd have to lose most of the parallel vector processing capabilities, i.e. the entire point of the processor would vanish.
As it is, Cell has no place in the desktop, except perhaps as a co-processor, provided they convince MS to define DirectX support for utilizing Cell expansion card for vector processing etc.
I'm not holding my breath here...
"Cell can't do this"
I assume you are basing your opinion on absolutely no concrete knowledge, because it would be absolutely pathetic if you really have read up on the architecture and spouted such nonsense.
Maybe you (and others) haven't noticed, but the desktop PC is a deer in the headlights. Game machines will take over before you can say 'service contract'.
Pft. People have been saying this every time a new console generation is coming. When the upcoming Playstation 2 was hyped, some people were claiming it would easily emulate a PC at many times the speed of an x86. When it came, people couldn't take full advantage of the hardware. When they could some years later, PC hardware had surpassed it. Besides, people value the flexibility of a PC. In other words, bs then, bs now.
Being bitter is drinking poison and hoping someone else will die
Nicholas Blachford is an idiot. Do not read any of his articles. Just to give you the best of Nicholas, read his antigravity article and visit his web site:
;)
http://www.blachford.info/quantum/gravity.html
Also, look at the nose pictures of him
http://www.blachford.info/other/me.html
Seriously, the guy has burned most of his sane braincells.
For a serious laugh, read his article series 'building the next generation' from OSNews. I really got good laughs from that 4 part series.
Also, it didn't take long to spot a totally idiotic statement in today's slashdotted article:
> Parallel programming is usually complex but in this case the OS will look at the
> resources it has and distribute tasks accordingly, this process does not
> involve re-programming.
Here Nicholas misses the core problem of parallel programming. The program algorithms _always_ have to be made parallel. The OS can't do it.
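One standard way to put numbers on this point is Amdahl's law, which nobody in the thread names; it is brought in here purely as an illustration. In this Python sketch the parallel fractions are hypothetical examples, not measurements of any real program.

```python
def amdahl_speedup(parallel_fraction, units):
    """Amdahl's law: overall speedup when only part of a program can be spread out.

    parallel_fraction: share of the run time that can be split across units (0..1)
    units: number of processing units available
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / units)


if __name__ == "__main__":
    # Hypothetical fractions, for illustration only.
    for fraction in (0.0, 0.5, 0.9, 1.0):
        print(f"{fraction:.0%} parallel on 8 units: {amdahl_speedup(fraction, 8):.2f}x")
```

A program whose algorithm is entirely serial gets exactly 1x no matter how many units the OS has to hand out, which is the objection being raised. How large the parallel fraction is depends on how the program was written.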
Keep hugging your peecee...
While the rest of the world is moving on to:
iPods, cellphones, Mac minis, TiVos...
Does the article's author read tarot cards or tea leaves too?
I'm under the impression that he does.
- Anon.
This part I agree with. His statements regarding abstraction are just flat out incorrect. Is this going to be programmed in assembly only? I think not...and if not there is significant abstraction involved. The thing that's closest to his point is that multiple *layers* of abstraction tend to add significant overhead. That doesn't mean that program-level abstractions do.
Once he got past the first 3 parts he started babbling. Linux on Cell, and so on, and so forth. If he had just read his previous parts he would have hit himself on the head. The only type of Linux this can run is uClinux. There is no memory protection as such. So no Linux, no Windows past 2000, no MacOS past X, and so on, and so forth.
There is memory protection if the PU is in fact "something like a G5". IBM would have to be insane not to include an MMU, and it has already stated that it's going to build workstations based on the Cell architecture.
All in all, interesting stuff...we'll see how it plays out. :-)
To continue on this, the power of a modern general purpose OS is the task switching. How long does it take to load and store the context of the vector processing units? Doing so requires moving their dedicated memory to main memory. This will take ages.
This, of course, depends on how many cells are in the box (with 8 vector units per cell) and how many tasks need vector units. The main purpose of the vector units in an interactive workstation will be multimedia processing. How many multimedia applications can you view at once? For me, the answer is one. The vector units may be useful for other things like engineering simulation and pattern matching, but once again how many different tasks using those features will be running at once? Plus if the processors are cheap enough to put 4 in a Playstation, one hopes the workstations will have 8 to 32 of them.
Overall, this is a design similar to the initial Cray 1 design. Cray's initial design smashed the IBM, DEC (and lesser fish) monopoly on big computing iron to bits. Unfortunately, the next thing the people buying the Cray asked for was "can we share this resource between two people?". The answer was provided eventually, but by the time Cray could do all the nifty time sharing and memory management tricks necessary for this, its advantage was no longer phenomenal. And all the people who could use Crays for single tasks with manual scheduling actually continued to use them that way. But it did not even dent the general purpose big iron market.
Two points. First, this is based on an already successful processor - the Power series. It already multitasks :-) and is used in a wide range of applications. Second, this will be a low-cost part. Crays were a super high-end system, which cost millions of dollars. Your analogy doesn't work.
Galileo: "The Earth revolves around the Sun!"
Score: -1 100% Flamebait
Seeing the number of similar angry answers to all posts questioning the hype around the Cell, I'm thinking you are a very persistent but boring little troll behind them all.
Why are you so emotionally attached to a piece of hardware, what is there to be so angry about?
If the cell does all it has been promised, Linux (and probably Windows, though I don't care) will be ported to it. I will notice very little difference to my desktop experience. IBM makes the processor inside rather than AMD or Intel. Fine by me. This isn't a war you know.
Being bitter is drinking poison and hoping someone else will die
Does this mean that all our Beowulf-jokes are obsolete?
SIG: TAKE OFF EVERY 'CAPTAIN'!!
Well, this could very well be the next "most powerful PC in the world" campaign from Apple. :-)
That said, I think one of the killer apps for this could very well be excellent voice recognition. That alone could provide a giant advantage over existing architectures.
Galileo: "The Earth revolves around the Sun!"
Score: -1 100% Flamebait
The story on CNN about Sadd*m of Iraq buying the PS2 for his WMD. What will it be for the PS3? Another lie from GW?
It always makes me laugh that today's cynics are always the ones who fell for the hype 5 years ago.
Just ignore the hype and look at the patent and make your mind up based on that. History is bunk.
First of all, the CELL concept is based on massive multithreading. As several other posters have said, software tools suck when it comes to parallel programming. Software engineers suck at parallelism, too.
Secondly, the 4 CELL processors of the PS3 will NOT give it a graphical edge over the PC. PS3 games will not be as impressive as the PC ones. The reason is that graphics will be handled by the NVIDIA GPU, not by the CELLs. But the PS3 will be stuck with a specific NVIDIA model, while PCs will be upgraded to the latest GPUs. Imagine the situation when Half Life 3 will be released: my PC will have a GPU that will be vastly more powerful than the one in PS3, simply because the PS3 one will be a generation behind. It is exactly the same situation now.
So what will this tremendous power be used for? Since the GPU will handle the rendering task, what will the vector units do (the vector units is where the power of the system is)? My 3400+ Athlon already handles HL2 physics at 80 FPS, without even blinking (and with simultaneous downloading and compiling!). Sound can be completely managed by a sound processor.
The only area that I see that CELLs are needed for a game is artificial intelligence; maybe by using neural networks in a game?
Finally, why aren't the GPUs networked like the CELLs? Isn't it time to move on to a scalable 3D graphics system, where the quality of the graphics is a function of the GPU power? The algorithms for that kind of graphics are already known and tested. Adding more GPUs would make the graphics better, revealing details not visible without them. That would be a motive for buying a next generation console.
What are you, some rabid fanboy?
Even a 1 GHz processor has MORE than enough power for a media center PC. How much speed do you think it takes to play an MP3 or do DVR functionality? The only thing holding back media center PCs is DRM, price, and interface.
Oh, and btw, you may not have noticed, but Sony's game consoles SUCK. By the time they're released Microsoft + Intel + Nvidia are already making games look twice as good. This has always been the case, and probably always will.
The only reason game consoles exist is because they're mass-produced last-generation hardware that can be had for cheaper than a *modern* PC.
doomed to repeat it.
I have not RTFA yet, but if programming this thing to make use of the multiple cells is anything like it's been on every other multi-processor system known to man... Sony's going to find developers producing less than the best software for the PS3.
Remember Sega Saturn? The PS1 ate it alive, though on paper the Saturn had more raw processing power. The problem was that developers couldn't find a good way to divide the load between processors. You had one doing practically nothing while the main game ran on the other.
This whole "cell" thing is going to make porting software from the PC a *bitch*, unless the developer is lazy and just uses one cell. Let the crappy ports begin!
-J
Yes, because the guy has a goofy website he must be ignored!! He likes linux, too! He must be a complete crackpot and because he is making a joke about his nose all of his ideas must be bullshit!
God, I love being a anal asshole. And my shit don't smell either!!!
IBM has said that one rack of Cell servers would have around 16 teraflops of performance. Using Apple's current Xserves it takes 40 racks to get 25 Tflop/s of performance. Apple states that one rack of Xserves currently can produce 630 gigaflops of performance. That means that, rack for rack, a Cell server will be able to deliver about 25 times the performance of a current Xserve.
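The arithmetic behind that comparison, worked through in Python using only the figures quoted in the comment. The numbers themselves are vendor claims as relayed by the poster and are not verified here.

```python
# Figures as quoted in the comment above.
CELL_RACK_TFLOPS = 16.0          # one rack of Cell servers
XSERVE_RACK_GFLOPS = 630.0       # one rack of Xserves
XSERVE_CLUSTER_TFLOPS = 25.0     # the 40-rack Xserve installation
XSERVE_CLUSTER_RACKS = 40


def rack_for_rack_ratio():
    """How many Xserve racks match one Cell rack, on the quoted numbers."""
    return CELL_RACK_TFLOPS * 1000.0 / XSERVE_RACK_GFLOPS


def xserve_gflops_per_rack_from_cluster():
    """Cross-check: per-rack figure implied by the 40-rack cluster."""
    return XSERVE_CLUSTER_TFLOPS * 1000.0 / XSERVE_CLUSTER_RACKS


if __name__ == "__main__":
    print(f"Cell rack vs Xserve rack: {rack_for_rack_ratio():.1f}x")
    print(f"Xserve GFLOPS per rack implied by the cluster: "
          f"{xserve_gflops_per_rack_from_cluster():.0f}")
```

The two Xserve figures agree with each other (625 versus 630 GFLOPS per rack), and the ratio comes out at roughly 25x per rack. That is a rack-level comparison; it only carries over to individual servers if both racks hold the same number of machines.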
No G5 in PowerBooks, in 2006 Apple will announce a whole new lineup of computers, all based on the Cell.
My Mythbox is unencumbered by proprietary crap. Can you say the same for your PS3?
Curb CO2 emissions: Kill yourself today!
Yes, that's right, I said it: Crackpot. The guy who wrote this "explanation" must be having a love affair with the IBM CEO. The computing he is talking about can occur in one of two situations: distributed computing or a "super computer" (which is the same thing). It is IMPOSSIBLE for a single processor to yield that much computing power. The problem is the speed of electricity WILL NOT KEEP UP. And the concept of distributed computing is going to run into the same issues as other networks in the past. This seriously sounds like some cocaine bunny that read papers from about 6 years ago, and possibly the "maybe" speech by Sony.
The article mentions the core running at 4.6 GHz. I don't believe it. Intel is using every trick in the book to push up frequency, even compromising overall performance, but is unable to push above 4 GHz. The Cell seems to have as complex an architecture as the P4, plus multiple cores on the chip, and they claim 4.6 GHz?
I for one quite like the idea of watching "Contact" on my TV while a PS3 sits in the background churning through a SETI@home unit every 5 minutes.
Every time distributed computing is brought up on /. the only thing anyone can think of is SETI!
My home cluster is set up for 3D rendering and (eventually) real-time video editing with Cinelerra.
...save us all from this idiot!
seriously, the gentleman that wrote this article is a worthy match for Bowie J. Poag (described properly at http://bowiepoag.is.batshitinsane.com/).
it is completely clear from the text that he knows shit about architecture of computer systems and even much less about parallel processing.
communication vs. processing ratio of parallel algorithms? never mind, we'll use the method at http://www.blachford.info/quantum/fastlight.html for communication between "cells". inherent serial nature of many algorithms? who cares! the OS will implicitly parallelize any algorithm, no matter, how serially designed...
anyway, take a stroll over the rest of his site: his knowledge of physics (demonstrated in the antigravity machine proposal) perfectly matches his knowledge of computers.
taking into account that the OS needs to take care of the in-between-cell networking and that Windsomething has crappy networking support and that IBM recently opened a lot of patents and that sony has previous experience in officially running linux on their consoles..
:D ..that was probably just a rhetorical question, we all know linux is always into anything!
DOES THAT MEAN LINUX MIGHT BE INTO SOMETHING HERE?
Isn't this kind of architecture extremely well-suited to being managed by a microkernel, with a level of abstraction provided between the details of keeping the processors busy and the dispatching of processes by the OS?
In this manner, the microkernel becomes a sort of "Cell compiler", breaking down the work presented to it and feeding it to however many processors are actually present in the hardware.
The number of "logical processors" seen by the OS might have only a faint resemblance to the actual number of Cell processors under the hood.
One question which was not addressed fully in the article was how you compile/test programs for this thing. The answer is OpenMP. OpenMP is a multithreading API which can hide parallelization from the user almost completely. It's embarrassingly easy to use - only one line of code is enough to parallelize a loop. All thread creation/synchronisation remains hidden from the user. It's extremely efficient too - I was never able to achieve the same level of performance when doing the multithreading myself.
>Here Nicholas misses the core problem of parallel
>programming. The program algorithms _always_ have
>to be made parallel. The OS can't do it
I'm not defending his article, but I don't think this "rebuttal" is correct.
An OS could share work between processors automatically, if the environment it provides for software to run in is insulated enough from the hardware.
Think of a JVM. What if you made a JVM that could run across multiple processors? You wouldn't have to reengineer the programs that run in this JVM to take advantage of multiprocessing, the multi-processor JVM would pass on the benefits.
So what? So you'll just create sloppy code and the Cell will still outperform any other consumer CPU by a large factor.
Other than that, I think the chaining of APUs is a pretty new idea. The way the interconnect works is pretty novel too. There are some other nifty bits too; you might want to read the whole article.
Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
open4free © fiction-science or penetration-pacience?
This is yet another extrapolation from Sony's patent of a few years ago. There's no new information, other than a lot of guessing about competition and marketshare and all that. Tech-wise, all of this is exactly the same information that's been used to write other articles...and there are the same massive, glaring holes.
So what? So you'll just create sloppy code and the Cell will still outperform any other consumer CPU by a large factor.
Probably not. A lot of code is performance-limited by memory bandwidth and wouldn't run any faster no matter how fast you make the CPU and FPU.
Other than that, I think the chaining of APUs is a pretty new idea.
No; that's one of the standard ways of parallelizing things.
Pretty much this entire design space has been explored before; the "cell architecture" just happens to pick some point in the middle.
This was not a technology article. That was a "I for one, welcome our new cell processor overlords.." article.
I don't see anything in the Cell architecture that would fundamentally make the same number of transistors at the same speed operate faster. I see lots of bottlenecks, I/O overhead and wasted transistors. If there is some magical powerful thing that these can do SO much better than the current x86 instruction set and hardware, guess what, it'll adapt.
x86 adapted to RISC being "wildly faster" and, in the end, became better RISC than RISC was by translating more memory-efficient x86 instructions onto a RISC backend. It adapted to SIMD (Single Instruction, Multiple Data) efficiency issues by adding MMX/MMX2/SSE/SSE2 and 3DNow. It adapted to the reality of 64-bit address space and the need for more registers with the new x64 instruction set extensions. AMD and Intel could add Cell hardware and instructions too if they offered anything special, which I highly doubt they will.
set softtabstop=4 shiftwidth=4 expandtab nocp worlddomination
This article seems blatantly inaccurate. Has anyone seen any of the other things on this guy's web page? UFOs, anti-gravity machines, faster-than-light travel, and so on. He appears to be a crackpot.
> An OS could share work between processors automatically, if the environment
> it provides for software to run in is insulated enough from the hardware.
> Think of a JVM. What if you made a JVM that could run across multiple processors?
> You wouldn't have to reengineer the programs that run in this JVM to take
> advantage of multiprocessing, the multi-processor JVM
> would pass on the benefits.
I assume you mean that the programmer doesn't write the program to be multithreaded (hence solving part of the problem), but the JVM makes a multithreaded version automatically.
If you wrote a JVM that moved execution between processors, it wouldn't work well because the JVM would need to do expensive data dependency calculations during program execution, and the synchronization between processors has its cost too. The research community has thought about this for decades without coming up with a great solution.
The only realistic model for the foreseeable future is to write the program to be parallelizable, i.e. the programmer solves the problem. This can work very well for very specific problems.
The program algorithms _always_ have to be made parallel. The OS can't do it.
Never heard of OpenMosix, huh? Fact is, with the multi-tasking nature of most computers today, the OS can do it!
However, having 8 independent vector processors in the same package is like having a 9-core box in hand. For general purpose computing, it's not as useful as one would think, but for scientific and engineering computing as well as game uses, it's damned nifty. It'd also probably rock for other multimedia purposes. Imagine being able to crunch video streams with H.264 at a D1 or better resolution in realtime as well as decode a similar stream. Instant Video over IP without any special hardware other than the CPU.
I am not merely a "consumer" or a "taxpayer". I am a Citizen of the State of Texas
You are not totally right: even if your program is compiled into a binary using sequential instructions, the CPU can usually parallelise execution with additional HW (out-of-order execution, completion, etc.).
But it is true that within a sequential program, the available parallelism is very low, so I'd say that the author of the article is "wildly overoptimistic" about the performance of a Cell CPU.
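To put a rough number on "very low": Amdahl's law is the usual back-of-envelope model for this. It is not something the article uses, and the parallel fractions below are made-up illustrations, not measurements of any real program. The sketch only shows how quickly the benefit of 8 extra units shrinks when most of the code stays sequential.

```python
def amdahl_speedup(parallel_fraction: float, n_units: int) -> float:
    """Upper bound on speedup when only part of a program can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at normal speed; the parallel part is split over n_units.
    return 1.0 / (serial_fraction + parallel_fraction / n_units)


if __name__ == "__main__":
    # Hypothetical fractions; 8 stands in for the 8 vector units discussed in this thread.
    for fraction in (0.2, 0.5, 0.9):
        speedup = amdahl_speedup(fraction, 8)
        print(f"{fraction:.0%} parallel on 8 units: at most {speedup:.2f}x")
```

Under this model a program that is only 20% parallelisable gains very little from 8 units, which is the point being made above.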
How could vector units help with emulating x86 instructions (except for SSE, of course)?
I wouldn't call him an idiot for this article, but a 'believer': his "faith" in the Cell is exaggerated!
Specialised computers are great with specialised SW but suck at general purpose computing or at emulating other (different) architectures.
Considering how much IBM has invested and banked on Java, wouldn't you think they would try to design a virtual machine that would take advantage of this architecture? I wouldn't expect the JIT to be able to parallelize (???) everything, but I would think it would know how to detect and translate certain segments of code which are easy to translate to a parallel architecture.
I don't know about you, but when I first heard about cell processors (and that fact that IBM was behind it), I immediately began speculating how IBM would exploit this architecture in their server market. This sounds like the sort of thing that will enable them to sell 256 processor monsters running AIX, DB2 and J2EE.
Even if designing to take advantage of this architecture is terribly difficult, just porting your webserver, database server, and transaction will solve the scalability issues for most Web/Client/Server applications.
IBM is buddying up to two (or three depending on your outlook) of the most powerful forces in "entertainment computing", Sony and Microsoft. Apple and IBM (and Freescale) are plugging away at the Power/PowerPC and passing the fruits on to Microsoft in the next generation XBOX.
Sony and IBM (and Toshiba) are putting together some of the most esoteric hardware ever to be contained in a consumer computing device and in all cases IBM sits right in the middle of it. Is this cross-fire waiting to kill the giant, or is IBM counting on its size and contracts to protect it?
What happens when Apple says, "Hey! We want to play with CELL in our next generation hardware." Or Microsoft says, "What's with giving Sony all the good hardware?"
What's the potential of all these competitors sourcing silicon from IBM and ending up converging on a common, or nearly common, architecture? Is this IBM carefully constructing a future monopoly? Is it the beginning of the ability to run software from Microsoft, Sony, and Apple on a single (or nearly single) architecture?
Remember the IBM LongTrail CHRP/PREP development boards? A friend of mine who did some Web work for IBM was paid partially with a dual processor PowerPC 604 development board that contained both an Apple ROM slot (used the beige G3 ROM) and full OpenFirmware (not the crippled version Apple implemented). Only the lack of well developed drivers for the peripheral hardware prevented it from running as a 100% compatible MacOS/WinNT/AIX (it was given to him with all three installed on the hard drive) machine. This sort of attempted hardware convergence is not a new route for IBM.
What do people out there think is going to happen? Are Sony and Microsoft far enough apart in their long-term goals to keep working with IBM without stomping all over each other in the process? Where does Apple (or even Nintendo for that matter) fit into this? What rumors or conspiracies have people heard?
Wonder if they have also been working to optimize Linux for the CELL processor? I for one will be watching this very closely...
The NSA: The only part of the US government that actually listens.
The weird part was the programming. The scalar and vector units operated independently, much like the Cell does. There were some new synchronizing instructions such as "wait-for-checkpoint" that allowed out-of-sequence operation to proceed up to specified places in the program. Writing code in assembler was a nightmare, but Fujitsu developed a FORTRAN optimizing compiler (most vector work was done in FORTRAN back then) that automatically handled the optimal register configuration and the synchronization. The folks that wrote that compiler have my utmost respect.
BTW, Amdahl sold only a handful of these processors, mostly to European oil exploration companies to process their seismic echo data.
You were 80% angel, 10% demon. The rest was hard to explain. - Over The Rhine
"Math in a song is good."-Linford
equals
x86 has won all of it is battles.
Come on people, it's "its," not "it's."
> The guy seems to understand hardware, but he does not understand shit about software.
It doesn't look like he really understands hardware either. Too many bold statements about the future- mostly hype and buzzwords. See his article series in osnews:
http://www.osnews.com/story.php?news_id=7676
Has anyone noticed that Nicholas has a fixation on mentioning the Amiga in any article he writes?
And AMD is just starting to make dedicated media chips.
One Cell to find them.
One Cell to bring them all,
and in darkness bind them.
Yes, the Cell processor will rule the world. After all, hasn't IBM been the Dark Overlord for decades now?
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
So this article claims that Cell forgets about abstraction and forces the programmer to go all the way down to the hardware. I cannot believe this to be true - even very good programmers have problems writing multi-threaded code, let alone vectorized multi-threaded code that should be suitable for a variable number of cores in assembly (which is actually an abstraction already).
If this is to be the software setup for their otherwise very nice architecture, I think it will fail miserably.
What they need is a vectorizing compiler at the least, for recognizing vector parallelism in algorithms. A vectorizing compiler alone is a very difficult thing to write. Basically it is all about data-dependency - no data-dependency between X variables undergoing operation Y means you can vectorize operation Y to work on these X variables simultaneously. That is theory. Now I have extended the Open64 compiler to do some small vectorizable kernels. Programmers can write a vectorizable loop in many ways and do it in very obscure ways, way beyond the recognition capabilities of a compiler. Writing a GOOD vectorizing compiler is a huge amount of work and even then it will not vectorize everything vectorizable you throw at it.
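To make the data-dependency point concrete, here is a small sketch in plain Python. It is not output from Open64 or any real compiler, and the function names are made up for illustration. The first loop has no dependency between iterations, so it can be processed several elements at a time; the second, as written, needs the previous result before it can compute the next one.

```python
def scale_add(a, b, factor):
    """No dependency between iterations: out[i] uses only a[i] and b[i]."""
    return [factor * x + y for x, y in zip(a, b)]


def scale_add_in_lanes(a, b, factor, lanes=4):
    """Same result, computed `lanes` elements per step, mimicking a 4-wide vector unit.

    This regrouping is only legal because no element depends on another.
    """
    out = []
    for start in range(0, len(a), lanes):
        chunk_a = a[start:start + lanes]
        chunk_b = b[start:start + lanes]
        out.extend(factor * x + y for x, y in zip(chunk_a, chunk_b))
    return out


def running_sum(a):
    """Loop-carried dependency: out[i] needs out[i-1], so the iterations
    of this loop, as written, cannot simply be executed side by side."""
    out = []
    total = 0
    for x in a:
        total += x
        out.append(total)
    return out


if __name__ == "__main__":
    a = [1, 2, 3, 4, 5]
    b = [10, 20, 30, 40, 50]
    print(scale_add(a, b, 2))
    print(scale_add_in_lanes(a, b, 2))
    print(running_sum(a))
```

A compiler has to prove the first property before it may vectorize, and that proof is the part that gets hard when the loop is written in an obscure way.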
Next to that you'll want something that maps data-independent (!) tasks onto different APUs using these software Cells.
To me it sounds like a hellish thing to program. Does anyone know what Sony has in mind? I really don't buy the no-abstraction thing, it goes beyond the comprehension of all but the most genius programmers.
I'd hardly call it obscene, though I could certainly apply that word to some of the PS2 games. <G>.
However, in a world where Bad now means Good, and many other adjectives are inverted in common usage, who knows wtf the author actually means anymore.
For that matter, who knows what wtf means anymore.
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
Isupposeyoucouldhavewrittenitwithoutspacesandthatw ouldhavemadeithardertoreadbutImnotreallysuresinces uchalargereadwithnoparagraphswassounsightlythatIch osenottoevenlookIbetyouforgettochangehtmlformatted toplainoldtextandfurthermoredidn'tusethepreviewbut tonDon'tbeupsetifsomeoneflamesyou.
half way through the article i was like 'wow, i cant F*@kin' wait to get my hands on a ps3'...and then it struck me. PR, its just an STI-loving PR'ist. its amazing how subliminal and disturbingly effective some multinationals' advertising can be. (yes, i know STI aint a multinash, its a group)
and on a different note, im somewhat of a ps2 fan, but i would NEVER let myself be fooled into believing that the warm and fuzzy feeling i once used to get as a materialistic teenager from hearing or seeing "SONY" or "NIKE" somewhere was anything more than millions of dollars of PR weaving its evil work...
in addition, if the article's claims are true and STI did end up as the future's equivalent of wintel, dont think that it would be computer-geek nirvana, because as i often remind people, the only goal of a corporation is to make money, and whether it be microsoft, sony or anyone else dominating the computer business, you will always be seeing ridiculous prices, poor quality, and worst of all, YOU WILL ALWAYS BE SUBJECT TO ATTEMPTS TO TURN YOUR RIGHTS INTO A COMMODITY FOR THE SAKE OF PROFIT, ala microsoft
-distantbody
Will the cell-based GPU in the Powermac G6 be labelled "ATI" or "nVidia", or will the Wheel of Life return the GPU to the core again, and it'll be running on the "1.5-way" (G6 + Cell) Power PC daughtercard?
I'm not actually surprised that so-called journalists, especially the technical kind, get good salaries. If you look at the painful clowns running the show at ZDNet, and most technical publications for that matter, including such wonder rags, such as the Register, you know that the Agenda is almost the most important thing. The actual realities of the tech world be damned as long as you have someone passing you your monthly wad of cash.
And this story is no different.
As many have noted, Sony did exactly this kind of hyping the last time around when the PS2, with its Emotion Engine, was supposed to be the future of all things computing. As everyone knows, the PS2 was a real pain to code for, and the actual performance was not better than the PCs of the day. The Cell will undoubtedly suffer from the same problems when it comes to coding real applications. Concurrency and parallelism do not an easier coding experience make.
I have no doubt that this thing will be good, but I absolutely doubt that it will have much or any effect on the x86 world of computing. The G4 processor, when it came out with the Altivec SIMD processor, which was apparently better than SSE at the time, didn't turn Apple into the next Microsoft overnight either, did it?
So, I expect that the x86 world will continue to thrive and that Apple will stick some of these Cell processors, having as they do a PPC 970, aka G5, in their core, in some of their machines and will make the usual wild RDF claims about how hot it is while it will be used by only a small fraction of actual Mac developers in reality, the Mac having to maintain backward compatibility only slightly less than the x86 world does.
In other words, it'll be business as usual.
Lots of people have been working on auto-parallelizing compilers. The idea is to take existing code that isn't parallel and during compile time (or run time) make those decisions intelligently and speed up processing. So far, there have been zero successes at it without explicit user directives to tell the compilers where good targets for parallelization are and how to do it (specifically creating threads and/or marking loops that can be parallelized).
:))
If you (or anyone) can solve this problem well, you'd be famous and wealthy beyond the dreams of avarice (assuming you patent it and license it out
open4free ©
the most "interesting" operating system i know of, Inferno, doesn't make use of a MMU. it's designed to be able to run on very minimal or embedded-style hardware (or as an application on top of another OS). it's got some really fascinating characteristics, and its handling of networking in particular is still way ahead of anything else out there, even eight years after its initial beta release. your comments about performance are still entirely appropriate, but there's plenty of very interesting - and even general purpose - things to be done with this.
i speak for myself and those who like what i say.
There's more parallelism in heaven and earth than is dreamed of in your philosophy...
inherent serial nature of many algorithms
The PowerPC 970 isn't a slouch when it comes to serial algorithms. But in a modern computer system, like Mac OS X, there's an awful lot of stuff that's deeply vectorisable. ALL the graphics, for example.
Powermac G5: dual G5 processors with Altivec + high-end ATI GPU, running a heavily 3d-accelerated GUI based on OpenGL.
What would a cell do for this?
Powermac G6: single G5/G6 processor with a cell coprocessor driving a dumb frame buffer, with OpenGL implemented as a stream of cells, Quartz Even-More-Extreme and real-time raytracing...
140 W - 10 W - 70 W - 20 W - 140 W - 140 W
open4free © (;
What gets me is how my computer is consistently outperformed by these gaming consoles. How can a $150 gaming console with 64MB of memory and some 1GHz Celeron outperform my $2000 powerhouse in graphics processing? There are only a few possible explanations for this paradox:
1) The consoles are very well engineered. Or as made apparent by my computer science teacher, "Just the Apple computers, they have control of the hardware, so they only have to focus on what hardware they allow."
2) The drivers are very well engineered and specifically optimized. In the words of my computer science teacher, "They can optimize the crap out of them."
3) The underlying operating system is far less bloated.
I'm all for extensibility and customization when it comes to our workstations, but if our operating systems, drivers, and 4GHz Athlon FX, ATI Radeon X850 PCI Express powerhouses are being outperformed by gaming consoles, something has GOT to be wrong...
Real programmers can write assembly code in any language. -- Larry Wall
It still is very bathery, but at least your eyes won't get lost in sameness of the text.
This is a very interesting architecture. It arises quite logically from the 'rediscovery' of vector processing for high throughput, low interdependence instruction streams that GPUs represent. For a long time, home computers didn't do the kinds of things vector processors were good at - they did complex, heterogeneous instruction streams, and the processors evolved to match. You can't run Word on a Cray after all.
We got microops, RISC, superscalar architectures, multi-level burst filling caches, branch prediction, hyperthreading, all the things that roughly speaking make a stream of instructions that depend on each other's results a lot, don't repeat themselves much, and get information from all over memory go quickly. The thing is, the jobs vector computers are good at started cropping up in home computer loads, mainly for graphics and media type uses.
The industry started responding to this with dinky little SIMD cores in essentially conventional processors, and then we got the GPU, which makes many more compromises to get you greater throughput. Now we have this thing, which is less graphics specific than a GPU but much more vector-heavy than a microprocessor. Most of the 'traditional PC' work, executing code, will be done in the 'supervising' processor I think, which is why it's so fat - you don't need a G5 just to push jobs around after all. Where these vector units come in is for doing 'work units' of the kind of stuff vector processors are good at - 3D graphics, physics, compression and decompression. Basically, all the heavy lifting needed for games and media use.
Those who have said it will be very hard to program are correct. If you ran a conventional program on it, you'd get the power of the supervising processor and not much more. You'd have to start looking for these 'work units' in your game, or whatever, and shooting them out to the vector units, along with 128k of all the data they'll ever need, and then pulling the results back when they're done. You'd have to cut the work units small enough so that they were done by the time they were due to hit the screen, or the speakers, which is why there is so little RAM in each vector unit - the jobs aren't expected individually to run for all that long. It is also why the article mentions realtime stuff built in there somewhere, so you know if you're falling behind before the output buffers run dry, and can do something about it.
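A rough sketch of that "cut it into work units and shoot them out" pattern, in plain Python. This is not Cell code and not anything Sony or IBM has published: the thread pool just stands in for the vector units, `process_unit` is a made-up kernel, 4 bytes per value is an assumption, and the sketch ignores that a real 128k local store would also have to hold the code.

```python
from concurrent.futures import ThreadPoolExecutor

LOCAL_STORE_BYTES = 128 * 1024  # per-unit memory figure quoted in the post above
BYTES_PER_VALUE = 4             # assumption: single-precision floats


def split_into_work_units(data, max_bytes=LOCAL_STORE_BYTES, item_bytes=BYTES_PER_VALUE):
    """Cut the data into chunks small enough to fit one unit's local memory."""
    per_unit = max_bytes // item_bytes
    return [data[i:i + per_unit] for i in range(0, len(data), per_unit)]


def process_unit(unit):
    """Stand-in kernel: touches only the data it was handed, nothing else."""
    return [2.0 * x for x in unit]


def run(data, n_vector_units=8):
    """Send every work unit to a pool of workers and stitch the results back together."""
    units = split_into_work_units(data)
    with ThreadPoolExecutor(max_workers=n_vector_units) as pool:
        # map() hands back the results in the same order the units were submitted
        results = list(pool.map(process_unit, units))
    out = []
    for result in results:
        out.extend(result)
    return out


if __name__ == "__main__":
    samples = [float(i) for i in range(100_000)]
    print(len(split_into_work_units(samples)), "work units")
    print(run(samples)[:4])
```

The hard part described above is exactly what this sketch skips: finding work that really is self-contained, and sizing it so each unit finishes before its output is due.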
The more of these vector jobs you found, the faster you'd go. You are really looking however at designing a whole program with the philosophy of an OS writer - there will be no abstraction here, and to get the performance you'll have to work for it. The thing is, where has abstraction really got us, we developers? Code is easier to write, and we can write it more quickly and ambitiously than we used to be able to. The problem is that we seem to have 'spent' a whole lot of the new capacity the hardware industry gives us on this one aim, the result being that this 2GHz P4 running XP in front of me seems about as responsive as a Mac Classic. Maybe it's time we gave up a bit of our precious abstraction, worked a little harder and passed a bit more of that new capacity we get every year on to the users. After all, the first guy to write a game that really does this for this kind of architecture is going to look pretty good, and the guy who comes out the next week with a simple port of the PC version is not.
It'll be interesting to see how the game programmers go at getting the best out of this thing. Rumours seem to indicate that they didn't really manage to 'get' the PS2 in this way...
On the contrary, it's a simpler architecture than the Pentium.
If they think you're crude, go technical.
If they think you're technical, go crude.
IBM is a very technical company,
so they decided to get as crude as possible.
There's no cache, so no cache coherency issues, and coarse-grained memory addressing. The most complex component is the Power-PC 970 core, and it's only a controller... it could easily be half-clocked at 2.3 GHz without significantly hurting the throughput of the vector units... and IBM already has 2.3 GHz 970s in production.
[Cell technology] is a revolutionary step forwards in technology and until now, the most serious threat to x86.
OK, until now it was the most serious threat to the x86. What is the most serious threat now that we have more information on Cell?
Try saying that fast three times.
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
I remember before the PS2 came out, they said it was going to have the power of a "supercomputer" and it would render lifelike images in real time. Looking at its graphics would be like watching a movie they said.
They even showed screenshots and videos which showed highly detailed images that looked fantastic. Everyone wanted to buy one.
Yet when the PS2 came out, those fantastic videos turned out to be nothing more than pre-rendered scenes that the PS2 was playing like a DVD player. It wasn't rendering those scenes in realtime. We were all tricked by clever marketers. The actual gameplay that the system was processing in realtime was ok, but nothing out of the ordinary. The PS2 turned out to be just another video game system surrounded by a lot of senseless hype.
I'm not holding my breath for the PS3 and its Cell processor. It will not have the power of xxx high end computer processors, and it won't be revolutionary. It will just be another evolutionary step in videogames with improvements - but not groundbreaking improvements - in gameplay and graphics.
Let's quit the senseless hype and keep things in perspective.
For all those who keep saying RTFA to the posters who see this article for what it is: I have RTFA and there is certainly NOT enough information to draw ANY type of conclusions with regard to performance of this architecture, other than that the author's "interpretation" of the patent looks good on paper. Present the article to any other computer and systems engineer and they will tell you the same. I deeply dislike the fact that the Internet has become a breeding ground for speculation and FUD such as this. Can we please investigate sources before posting this stuff? This man has produced articles in which he 1.) describes how to develop a gravity drive while taking into account all the problems presented in Star Trek, 2.) claims to have solved several problems relating to time travel, while admitting it's not possible, and 3.) freely admits he is not qualified to speak about anything at all and uses Isaac Newton as his excuse for taking editorial license. I think everyone is entitled to an opinion, but it shouldn't be published in a location where impressionable readers can be misled, or propagated by people who don't know any better.
A little non-sense now and then is relished by the wisest men. -Willy Wonka
Imagine that. A network of PS3s processing SETI data in real-time. Wouldn't be surprised if that application isn't already under development.
And so what if the author is off by even an order of magnitude in his estimate of processing time, as another poster suggests, due to the lack of actual hardware to test this on yet. Within a couple of years of launch there are going to be tens of millions of PS3s out there, all with network connectivity.
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
Not going to defend Nicholas, but it is certainly the case that better APIs that are highly parallel can be provided by the OS for graphics and AI operations. After all, that's where the bulk of processing time goes in modern games. A lot of work will still be required on the developer's part, but I can see a good chunk being abstracted behind an API layer.
I don't know what kind of crack I was on, but I suspect it was decaf.
What rumors or conspiracies have people heard?
Only Bigfoot's orange juice and eggshell wonder diet.
So no Linux, no Windows past 2000, no MacOS past X, and so on and so forth.
Uh, Windows before 2000 had memory protection... IIRC it just didn't do the copy-on-write that NT does for dlls, and it allowed you to unprotect shared kernel space and then modify it.
Win9x, which wasn't based on the NT kernel, would be effectively impossible to port to another architecture anyway. It wasn't designed with that in mind and it's so old now that it's not worthwhile.
The algorithms needed to make the Itanium go fast are still being figured out.
The algorithms needed to vectorise operations are well known and in wide use in every GPU in the world.
Sony made all kinds of outrageous claims before the PS2 came out.
http://csmonitor.com/cgi-bin/durableRedirect.pl?/durable/1999/03/25/f-p13s2.shtml
They said it was going to be able to render "Toy Story" like animation in real time.
http://www.time.com/time/asia/magazine/2000/0320/japan.sony.html
"But it's souped up with extras: a microprocessor as powerful as a supercomputer and ports for hooking up cable TV, keyboard, mouse, digital-video camera and modem card. The possibilities are huge. "
From just looking at a simple top-level diagram of the Cell architecture, it is clearly shown that the Cell is much more powerful than any other processors currently available.
LOL. You can tell what the real-world performance of an unreleased chip is going to be, just by looking at a simple top-level diagram?
I can guarantee that the Cell is much more powerful than any AMD and Intel processors.
Hahaha. You can guarantee, huh? So the chip in a game system marketed towards gamers is going to deliver more power than high end CPU's that cost many times what the entire game system will cost?
I don't know what you're talking from, but you're not talking out of your mouth.
http://www.geek.com/news/geeknews/2003May/gee20030528020156.htm
Called "Windows XP Professional x64 Edition" and the preview release works pretty well. I'm able to play Medieval:Total War on it quite happily.
Phil
I guess today is a passable day to die.
"The algorithms needed to vectorise operations are well known and in wide use in every GPU in the world."
Not quite. It's still hard to foresee what can be vectorized, and what can't. Especially in a dynamic language.
IMHO The vectorizing decisions should be pushed into hardware. Same with threading.
I'm not defending or bashing anyone here. But I think the way this is possible is through JIT. Compilation is a parallelisable problem. As you execute code on one CPU you can compile code on another.
The main difficulty would be the fact that the APU and the x86 architectures are completely different. This might impact performance. Or perhaps it's possible to use the APUs for compilation and the main CPU for execution, or maybe a mixture of both approaches.
Anyway, it's way too early to draw any conclusions.
Ars Technica covers the 'why' of the design.
"I'm willing to believe that a 4.6 GHz chip with 8 ALUs and high bandwidth memory would be fast, but even in bulk, there's no way they can afford to put 4 of them in a sub-$500 game console."
Since you've been reading the PR, you already know they intend for the Cell processor to be used in more than just a $500 game console.
That and it was a good way to get in a good poke at MS. They still haven't forgiven them for OS/2.
Plus now they have a gateway into China.
The Cell has 4 FPUs times 8 PEs at 4.6GHz, for a theoretical 147 GFLOPS with unknown power consumption. The ClearSpeed chip has 2 FPUs times 96 PEs at 250MHz, for a theoretical 48 GFLOPS with 5W power consumption.
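The arithmetic behind those two figures, as a quick sketch. It assumes one floating-point operation per FPU per clock cycle, which is the only way the quoted numbers work out; real chips may count multiply-adds differently. The 5 W figure is the one quoted above, and the Cell's power draw is left out because it is unknown.

```python
import math


def peak_gflops(fpus_per_pe: int, pes: int, clock_ghz: float) -> float:
    """Theoretical peak, assuming one floating-point operation per FPU per cycle."""
    return fpus_per_pe * pes * clock_ghz


CELL_GFLOPS = peak_gflops(4, 8, 4.6)          # 4 FPUs x 8 PEs x 4.6 GHz
CLEARSPEED_GFLOPS = peak_gflops(2, 96, 0.25)  # 2 FPUs x 96 PEs x 250 MHz
CLEARSPEED_WATTS = 5                          # per chip, as quoted above

if __name__ == "__main__":
    ratio = CELL_GFLOPS / CLEARSPEED_GFLOPS
    chips = math.ceil(ratio)
    print(f"Cell:       {CELL_GFLOPS:.1f} GFLOPS peak")
    print(f"ClearSpeed: {CLEARSPEED_GFLOPS:.1f} GFLOPS peak per chip")
    print(f"Ratio: {ratio:.2f}, so {chips} chips at {chips * CLEARSPEED_WATTS} W to match or beat one Cell")
```

Both numbers are paper peaks; neither says anything about what either chip sustains on real code.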
Combine 3 or 4 ClearSpeed chips and you'll get the performance of a Cell unit with no doubt lower power consumption. Of course by the time the Cell sees the light of day in 18 months, it is quite likely that ClearSpeed will have updated their chips to match.
aQazaQa
> Never heard of OpenMosix, huh?
To take advantage of Mosix, one has to split the task into _separate_ processes. Hence the programmer can only parallelize at the process level (very coarse granularity), and by doing that the _programmer_ solves the problem. So this isn't a good counterexample.
The article is full of inaccuracies (at least 50% of his "facts" are wrong). Ironically his conclusion section is probably correct. (Have to post anon due to NDAs :-)
The article pitches this as a cell vs PC fight where the cell will be 10x faster than the PC we use today and therefore we will all purchase cells. Uh-huh.
When will these kinds of guys get it? It has taken years of time and zillions of hours to get to where we are today. The new 10x faster whatever pales into insignificance if it can't do what we are doing right now. The Cell approach obviously has a lot of potential but it will never get mainstream adoption unless it integrates into what we already have. If the promise of the technology is actually realized (and that's a big if), it will reach the PC platform only as some sort of transparent add-on to the current platform architecture. Then it will be gradually integrated until new PCs have a sticker on the front that says 'includes CELL' or whatever.
We are software limited, not hardware limited, and anything new had better work great with everything we are doing now. Otherwise, that new standalone CELL thingy will be just another piece of really fast hardware that everyone ignores because it can't do anything.
Well I'm going to get this guy Slashdotted.
The COSA OS. Try matching that with the CELL?
Cell provides a machine - but they do it in hardware, the equivalent of Java's virtual machine is the Cells physical hardware. If I was to write Cell code on OS X the exact same Cell code would run on Windows, Linux or Zeta because in all cases it is the hardware Cells which execute it.
So the apps will run on anything - you merely need to rewrite every OS ever made!
George II -- Spreading Freedom and American values, one bomb at a time.
But once you've compiled it, you don't need to compile it again, therefore your JIT optimization is over within a fraction of a second.
+1 Insightful, -1 Troll. What can I say, I'm an Insightful Troll.
Or, rather, from the sound of it, programming it is going to be like listening to German heavy metal. As in, really, really hard...
Uh, yea, it's important to keep in mind that this is all pretty well theory, I could find some other interesting patents that describe in detail that flying car we're all supposed to have, along with thousands of other undeliverable advances. On the other hand, there will be a PS3, and Sony, Toshiba and IBM are spending billions to make sure it's a super-fast successful system, so... we'll see.
was the ps2 the supercomputer it was said to be...?
Actually, uh, it's how old and does what for what price and has been how successful ? I'm not sure I'd scoff at the PS2, no matter how much marketing drones over-hype products.
After seeing the architecture and the theoretical numbers, the Cell looks like an architecture that I was thinking about for an OS.
PU -> DMAC -> various APUs
EXOKernel -> Security Policy -> Hurd-like servers
What does the function
software_difficulty( number of CPUs )
look like?
Is 100 CPUs more difficult than 10 CPUs?
10 CPUs is certainly more difficult than 1 CPU.
You would think that we would have learned our lesson from the PS2 hype, but then again, perhaps Sony has learned their lesson as well. I think one thing is clear: the PS3 will easily outperform the XBox 2 and Nintendo's "Revolution", as long as the programmers can cope with the new paradigm.
I hear you (and totally agree). I'm comparing encumbrance b/w Windows MCE and the PS3. Ultimately both will look encumbered as hell next to opensource linux-based DVR software.
(Building my Myth TV box now...)
------ The best brain training is now totally free : )
What is the power consumption? Will Sony be able to build the PS3 without a fan, or will customers be able to use the PS3 to heat their whole house in the dead of winter?
I've abandoned my search for truth; now I'm just looking for some useful delusions.
AMD64
given the modular design of Cell, couldn't an x86->Cell emulator module be built?
Hardware emulation, on a hardware level.
The module would talk the x86 arch to the outside, but talk whatever Cell uses on the inside.
Would allow me to run every x86 app I have from winders to suse, but I wouldn't have to port anything to a new arch.
might be a little slower, but adding more modules could make the cell cluster look like multiple CPUs to an OS.
unless I'm missing something fundamental about Cell's design.
Two words
PSX Three
These companies have been screwing the pooch on this one for a long time and they aren't going to stop anytime soon.
Bittorrent is it, bittorrent is video on demand.
That cable in your wall will be digital tcp/ip.
I have a Sony Media Center PC (Running Windows XP Media Center) and it is a complete piece of shit. (Since it is always recording and playing back video at the same time, it appears to drop frames like crazy. Also, Doom 3 will only run with a decent frame rate at 640x480. All this for only $2400!). I'm unclear whether this is Sony's or Microsoft's fault (probably a combination of both), but it is abundantly clear that Windows Media Center is not yet ready for prime time. Hmm... does the Linux PVR software support the latest Sony PC hardware?
I've abandoned my search for truth; now I'm just looking for some useful delusions.
Actually, a lot of what killed the dreamcast was that it is trivial to pirate games for it.
Take off every sig. For great justice.
Wow, this article is amazing, Joseph Goebbels would be proud. Why does this ridiculous sort of hype happen just with playstation?
The current processing bottleneck, and the reason for caches in the first place, is the bandwidth between the processor and the memory. A "normal" memory bus cannot keep up. This is why you see so many attempts to speed this particular part of the system up. There is RAMBUS, DDR, even HyperTransport.
What these guys are trying to do is move the processor to the memory rather than the inverse. Having fast expensive caches near the processor is an attempt to get the memory closer to the proc. What has been happening of late is that lots and lots of on-chip transistors have been spent on the cache. The Cell architecture is a step in the other direction. They want to spend those transistors on processors instead of memory.
At the limit of this idea you would see something like a super-granulated architecture with a processor on each memory chip. Imagine a PC with 32/64/whatever cell processors *and* no classic "processor socket" on the motherboard, just some DIMM-like "cell" slots. Each proc would have exclusive access to the memory on its own chip and all would communicate via some sort of bus or fabric of links. So, instead of one mega proc with tens of millions of transistors (perhaps half would be cache) at 4GHz with a 400MHz memory bus that is 32, 64, 128, or whatever bits wide, you'd have maybe 64, 128, 256? simple ARM-like procs at 400MHz, each with something like 400MB/s or more of available memory bandwidth per proc.
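A rough back-of-the-envelope check of those numbers, as a sketch only: the 64-bit bus width and the count of 64 procs are simply picked from the ranges mentioned above, and one transfer per clock is assumed (real buses differ).

```python
def bus_bandwidth_mb_s(clock_mhz, width_bits, transfers_per_clock=1):
    """Peak bandwidth of one shared bus in MB/s (1 MB = 10**6 bytes)."""
    return clock_mhz * transfers_per_clock * width_bits / 8


def aggregate_bandwidth_mb_s(num_procs, per_proc_mb_s):
    """Total bandwidth when every proc has its own private memory link."""
    return num_procs * per_proc_mb_s


# One big proc behind a single 400 MHz, 64-bit memory bus (assumed width).
shared = bus_bandwidth_mb_s(400, 64)

# 64 small procs (assumed count), each with 400 MB/s to its own on-chip memory.
distributed = aggregate_bandwidth_mb_s(64, 400)

print(f"shared bus:  {shared:.0f} MB/s")
print(f"distributed: {distributed:.0f} MB/s ({distributed / shared:.0f}x)")
```

Under those assumptions the distributed layout has several times the aggregate bandwidth, but only for work that stays local to each proc's own memory; anything that crosses the fabric is not modelled here.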
Of course the extreme limit would be to have millions of 1 bit processors, but I don't think that anyone is proposing that just yet. Things do get more and more neuron-like as you approach this limit, interesting eh?
Good judgement comes from experience, and experience comes from bad judgement.
- W. Wriston, former Citibank CEO
Besides the real world hardware issues that will likely come up, the author of the article didn't really cover the "politics" that will be involved in trying to make Cell replace x86 processors.
Most companies, unless some radically new programming techniques are made, are going to have a hard time switching from traditional software programming to one based on the Cell's parallel computing. Even he said it's a pain programming software that will correctly execute in a parallel environment. The only ones likely to switch over with any swiftness will be the game and high end computer industry. But I doubt someone would need a 30 GHz processor to use MS Word or other such programs. That's where most of the commercial PC business is located, using "work" programs that don't require blazing fast processors. And even with most of the computers being in the business sector, Cell is unlikely to (initially if even for a while) make a dent in the server and database markets.
Switching back to the hardware front, what about all the data being fed to the processor to compute? It comes straight from storage devices, whether they be flash, optical, magnetic, whatever. The data transfer rates for these are not close to utilizing the full potential of the Cell processor and won't be for a while. On top of that, and this being my specialty, because Cell can handle so much so fast, it will require huge storage mediums along with huge data transfer rates. This will be needed due to the sheer size of new software written for Cell, which will be quite large simply because Cell can handle it quickly. Terabyte/in^2 storage, which is what I am working on using nanoparticles and what will be needed for Cell applications, is at least several years away and will cost quite a bit when it comes out. Most people won't want to spring for the hard drives or other storage mediums needed to fully use Cell. So until then, Cell might actually be more than we need or can handle. It's going to require leaps and bounds in other areas of the computer industry and that may be a little harder to do.
But then again, these huge leaps and bounds may come about tomorrow so who knows what will happen. Science and Technology are advancing so rapidly that it's anybody's guess as to what will happen. Sometimes all it takes to rule the world is just being at the right place at the right time with the right tools. Just ask Bill Gates.
DarkGeek
sheeesh... If you would stop for a moment to think about it, you'd notice that this architecture is vastly different from your average PS2 arch. The "caches" are actually one per APU which means that these APUs have a kind of access to "locked-down" cache memory - this is basically what most modern signal-processing CPUs do. Take a look at some DSP code specifically tailored for the PPC440 cpu, it explicitly takes advantage of the ability to lock the CPU's caches. I'm sure the main processing unit WILL have L2 cache, though.
Parallel processing is nothing new but this chip sounds like Professor Dally's custom stream processor, the Merrimac. He lectured Congress on the need for a vector/streaming processor supercomputer because the current supercomputers are inefficient. His Stanford website. Description of cell sounds just like his processor. Even the drawings. The only difference is the location of memory. He makes the point that memory should be accessed fast and installed on the chip. That's what's different between the two.
From TFA,
The Cell could be Apple's nemesis or their saviour, they are the obvious candidate company to use the Cell. It's perfect for them as it will accelerate all the applications their primary customer base uses and whatever core it uses the PU will be PowerPC compatible...
The Core Image technology due to appear in OS X "Tiger" already uses GPUs (Graphics Processor Units) for things other than 3D computations and this same technology could be retargeted at the Cell's APUs. Perhaps that's why it was there in the first place...
If other companies use Cell to produce computers there is no obvious consumer OS to use, with OS X Apple have - for the second time - the chance to become the new Microsoft...
PC manufacturers don't really care which components they use or OS they run, they just want to sell PCs. If Apple was to "think different" on OS X licensing and get hardware manufacturers using Cells perhaps they could turn Microsoft's clone army against their masters. I'm sure many companies would be only too happy to get released from Microsoft's iron grip. This is especially so if Apple was to undercut them, which they could do easily given the 400% + margins Microsoft makes on their OS.
Licensing OS X wouldn't necessarily destroy Apple's hardware business, there'll always be a market for cooler high end systems [Alien]...
A Cell based system running OS X could be nearly as cheap (depending on the price Apple want to charge for OS X) but with Cell's sheer power it will exceed the power of even the most powerful PCs. This system could sell like hot cakes and if it's sufficiently low cost it could be used to sell into the low cost markets which PC makers are now beginning to exploit. There is a huge opportunity for Apple here, I think they'll be stark raving mad not to take it - because if they don't someone else will - Microsoft already have PowerPC experience with the Xbox2 OS...
Cell will have a performance advantage over the PC and will be able to use the PC's advantages as well. With Apple's help it could also run what is arguably the best OS on the market today, at a low price point. The new Mac mini already looks like it's going to sell like hot cakes, imagine what it could do equipped with a Cell...
That that is is that that that that is not is not.
> shots from the N64 were CG renders that still aren't matched in the most powerful PC or console games today.
X-Box is the one that was caught making up screenshots and passing them as real time. We found out on a message board about one of these pictures, and the official X-Box site had to clarify that those were indeed pre-rendered shots (the game was Amped).
http://sellmic.com/index_old.html
- sigs are for wimps.
This kind of crap comes out every year or two, the x86 is the target for everything to beat, or at least in the eyes of the /. crowd. For some reason people have these biased ideas about the x86 sucking, or some particular arch with a small benefit in one area or another being able to leverage that benefit to completely take over a whole market. Let's see, x86 killers.. the itanium, the ppc, the cellphone (mostly ARM), java, the PS2, going back further the whole group of chips labeled as "RISC" even though after a generation or two none of the RISC chips were RISC. etc...
The x86 is popular for a number of reasons, each of which may be more important for a particular customer base. It's more a function of being a generalist. It tends to be "good enough" for most general purpose computing needs and it's cheap enough, and well understood. Just because some arch comes along that is lower power, faster, higher bandwidth etc.. doesn't mean that it will be able to leverage that one advantage enough to upset the massive installed base of x86's.
Anyway, in this case, having to rewrite all your OS and software doesn't bode well for the Cell processors in becoming the x86 killer, at least not for the next 5-10 years or so.
You can tell what the real-world performance of an unreleased chip is going to be, just by looking at a simple top-level diagram?
So, instead of providing reasons why my claims are false, you go on to question valid points with sarcasm without providing a single bit of evidence to support your claim? I am not the one not talking out of my mouth.
The write-up is not PR. It is written by someone who figured out the architecture from a patent. Tell me, what processor on the market today has a main processor and 8 arithmetic units attached in a parallel fashion? In a home system, most computing power goes into audio, and 2D and 3D visualization, much of which is parallelizable. Tell me which current Intel and AMD processor has the capability of pipelining a video encoding/decoding the same way that Cell does. Tell me which Intel and AMD processor today can produce results for as many vector problems as the Cell can.
IBM, you know-- the people that make RISC chips-- would tend to disagree with you:
http://www-306.ibm.com/chips/techlib/techlib.nsf/products/PowerPC_970_and_970FX_Microprocessors
Oh, and Google:
http://www.google.com/search?hl=en&lr=&oi=defmore&q=define:RISC
Taking a definition of a word from 10 to 15 years ago and applying it to today's language is, well, ignoring the developments of the past decade. The term RISC simply means that the instructions used are fewer and less complex. It also means that design goals are to be most efficient per clock cycle, as opposed to brute force. In addition, they require fewer transistors, which makes them run cooler and be cheaper to produce. It's not the narrow definition you make it out to be.
I suggest you buy the G5 you've been having dreams about and finally address that itch.
"Politicians find new names for institutions which under old names have become odious to the people."
This really seems like the proverbial system on a chip. If you don't need a PCI bus, or AGP bus, or any of the equivalents, the system chipset becomes a rather small bit of silicon. AMD has shown us how to put the memory controller on the CPU die. So why not go the extra step and combine what's left of the chipset with the CPU? And then, just to make things interesting, why not take the computational units out of the graphics card, and put them where they can be used more easily. Consider shader 4.0 Apps using the video card as another (rather specialized) processor. But instead of putting all those vector units in a video subsystem, they've been put on the CPU where *any* program could make better use of them.
What this seems like to me is simply that [TSI] has taken all of the processing power which would have been spread across several chips, and put it all on *one* chip. With a memory controller and a stripped down system chipset. A modern PC is simply a bunch of coprocessors all in one box. It makes sense to take all those separate coprocessors, put them on one chip, and standardize the programming interface. It's like being able to use the processing power of the dsp on your sound card to make your graphics a bit faster.
I think the "GPU" nVidia is designing probably handles little more than the video buffer and a few generalized, non computationally intensive duties.
Oh, and 85 degrees Celsius with a heatsink is *excellent* thermal performance. If you don't think so, try running your processor with just a heatsink, fanless.
Magic. IBM (with the selloff of their harddrive division) now has no need to grind up pixies for the magic "pixie" dust they use to make harddrives. So they have all these pixies and nothing to use them in - so they'll be used to cool the chips :)
It is all a big misunderstanding between Sony and IBM.
IBM told Sony it was going to "Sell" its PC business. Sony has been telling everyone about IBM's "Cell" PC ever since.
Seriously though: For all we know, the PS3 may have four cells. (One CPU core, and three "APU" cells.) One APU for the boobs, one APU for animated low polygon count "hair", and one for inane dialog.
Maybe the new splice() based pipes in Linux can be used to move data between APUs.
Ever notice how much the Gamecube and the Mac Mini have in common?
In other words, media data and processing algorithms will be behind an impenetrable DRM hardware wall. "Cell programs" (the little vectorizable data manipulators) will be trade secrets. Outsiders that want to program something new will only be able to string together DRM approved cells. For example, there might be an approved MPG6 cell that will report meta-data found initially in a MPG6 stream but Rights Management interests will never permit any cell that exports all of the MPG6 data.
Why does the recommended single chip PE (processing element) include 8 DPUs? My guess is that a certified library of Cell Programs will not allow anything to be sent off chip that is not strongly encrypted. Thus one might have an 8 DPU chip where 3 are used to decrypt the input, 2 to do the actual processing, and 3 are used to encrypt the output. This off-chip disadvantage is a strong reason for putting multiple PUs and their 8 DPUs on one chip - if intercommunication between Cells cannot be detected externally then there is no need for the encryption/decryption stuff.
PC software is written around certain assumptions, for example: moving data from one place to another is slow, so you load everything into RAM before you need to work with it, then load all your graphics data into video RAM before you need to display it, etc.
The PS2, and I suspect most other consoles, doesn't work that way. It has a few processors running in parallel, and MASSIVE bandwidth, and you don't need a lot of temporary storage because it's easy to just pass data from one chip to another as you work with it.
Porting PC software to the PS2 without dramatically restructuring it will result in obvious problems, since the PC software relies on a lot of temporary storage. Porting PS2 software to the PC will also result in problems, not because the PC is a poor design, but because it's just different from the PS2.
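A minimal sketch of the two styles described above, purely illustrative: `read_blocks`, `decode` and `render` are hypothetical stand-ins for a data source and two processing stages, not any real console or PC API.

```python
def read_blocks(n):
    """Hypothetical data source: yields n small blocks one at a time."""
    for i in range(n):
        yield [i] * 4


def decode(blocks):
    """Hypothetical processing stage: transforms each block as it arrives."""
    for block in blocks:
        yield [x * 2 for x in block]


def render(blocks):
    """Hypothetical final stage: consumes blocks and keeps only a running total."""
    total = 0
    for block in blocks:
        total += sum(block)
    return total


def buffered_style(n):
    """PC-style: stage everything in temporary storage before each step."""
    staged = list(read_blocks(n))    # whole input held in RAM
    decoded = list(decode(staged))   # second full copy held in RAM
    return render(decoded), len(staged) + len(decoded)


def streaming_style(n):
    """Console-style: pass each block straight from one stage to the next."""
    return render(decode(read_blocks(n)))


total_buffered, blocks_held = buffered_style(1000)
total_streamed = streaming_style(1000)
print(total_buffered == total_streamed, blocks_held)
```

Both versions compute the same result; the buffered one just holds every block twice along the way, which is the kind of temporary storage a straight port would drag onto a machine that was designed around passing data along instead.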
Visual IRC: Fast. Powerful. Free.
This contributed to the lack of developer support for the full feature set of the saturn. Scant few games actually took advantage of the dual CPUs.
Okay, one by one:
The CS monitor says it exactly like it is: PIII gets 2 gigaflops, and the PS2 gets 6.2 gigaflops. Considering the former is just for the CPU, and the latter is for the whole system (CPU + GPU), this isn't outrageous. The "twice as fast as a supercomputer" stuff was editorializing on the part of the writer.
In the Time article, Sony doesn't exaggerate at all. The PS2 really can sustain ~70M polygons per second when they are simply shaded. The "microprocessor as powerful as a supercomputer" stuff is again, editorializing on the part of the writer.
Show me actual technical literature that shows Sony exaggerating about the PS2. They take favorable numbers (eg: flat-shaded polygons instead of textured polygons), but they are for the most part verifiable. Note that the presentations about Cell aren't the same sort of thing as fluffed-up magazine articles. They are professional presentations, stuff written by geeks for people who know what a DMA controller is. If they are making stuff up for this, they'll be discredited by their peers. I doubt Sony and IBM engineers would take those risks.
A deep unwavering belief is a sure sign you're missing something...
My 2 year old athlonXP 2000+ with a 9600 pro and 256MB of RAM runs NFSU2 much prettier than an xbox/PS2 can, with a perfectly playable framerate, all while playing MP3s in the background, and running that big, bloated monstrosity known as windows.
This is my sig. There are many like it, but this one is mine.
Boy did this guy sucker you in. Cell is similar to this concept, but changed from the early patent filings.
> They take favorable numbers (eg: flat-shaded polygons instead of textured polygons), but they are for the most part verifiable
It's worth noting that a whole lot of Japanese games use only shaded polygons. Even now, most console games use fewer textures than a PC game, and use algorithmic tricks instead. It took developers years to get their heads around the PS2's bizarre VPU design, but it turned out to be damn powerful in the end (it manages to stay reasonably competitive with the X-Box even though it's a full generation older and has half the RAM).
Hopefully the API will have a better translation job done this time. CLITIndex anyone?
Windows NT was written as the successor to DEC's VMS operating system. That's why the MIPS base. Add 1 to (each of) VMS, and you get WNT.
Bit of history for the kids.
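The letter trick is easy to check for yourself; a tiny sketch (the helper name `shift_letters` is just made up for illustration):

```python
def shift_letters(word, offset=1):
    """Shift every character of word forward by offset code points."""
    return "".join(chr(ord(ch) + offset) for ch in word)


print(shift_letters("VMS"))  # prints "WNT"
```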
Cell doesn't do any of that. Cell is at this stage a few patents and a dozen or so press releases.
Tell me which current press release has the capability of pipelining a video encoding/decoding the same way that an intel/amd processor does?
The fact that 99% of consumer applications other than games don't require additional CPU power.
hurts my eyes.
I predict the next step in software engineering for power hungry applications will be to move away from the OO model and instead design data processing components that execute in parallel. It would be something like a hybrid between VHDL and C++.
That's FreeBSD, bubba. And NeXT started off on 68K, not x86 - they got to x86 late in their lifecycle.
Maybe you should have read the article instead of trying to impress us with your rad knowledge oh master. The Linux kernel and GCC will be compiled, no doubt about it.
I predict the next step in software engineering for power hungry applications will be to move away from the OO model and instead design data processing components that execute in parallel.
Data processing components that execute in parallel...that perhaps communicate by passing messages? Congratulations, you have just defined OO programming.
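For what it's worth, here is roughly what "components that run in parallel and pass messages" can look like, as a sketch under assumptions: the names `component`, `run_pipeline` and `STOP` are invented for this example, and plain Python threads are used only to show the structure (CPython threads won't give a real speedup on CPU-bound work).

```python
import queue
import threading

STOP = object()  # sentinel message telling a component to shut down


def component(transform, inbox, outbox):
    """One processing component: receive a message, transform it, send it on."""
    while True:
        msg = inbox.get()
        if msg is STOP:
            outbox.put(STOP)  # pass the shutdown signal downstream
            return
        outbox.put(transform(msg))


def run_pipeline(values, transforms):
    """Chain one component per transform, connected only by message queues."""
    queues = [queue.Queue() for _ in range(len(transforms) + 1)]
    workers = [
        threading.Thread(target=component, args=(t, queues[i], queues[i + 1]))
        for i, t in enumerate(transforms)
    ]
    for w in workers:
        w.start()
    for v in values:
        queues[0].put(v)
    queues[0].put(STOP)

    results = []
    while True:
        msg = queues[-1].get()
        if msg is STOP:
            break
        results.append(msg)
    for w in workers:
        w.join()
    return results


result = run_pipeline([1, 2, 3], [lambda x: x * 2, lambda x: x + 1])
print(result)  # [3, 5, 7]
```

No component touches another's state; they only see what arrives in their inbox. Whether you call that OO or dataflow is exactly the argument above.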
The write-up is not PR. It is written by someone who figured out the architecture from a patent. Tell me, what processor on the market today has a main processor and 8 arithmetic units attached in a parallel fashion? In a home system, most computing power goes into audio, and 2D and 3D visualization, much of which is parallelizable. Tell me which current Intel and AMD processor has the capability of pipelining a video encoding/decoding the same way that Cell does. Tell me which Intel and AMD processor today can produce results for as many vector problems as the Cell can.
The mistake you are making is that you're comparing a processor that only exists on paper and press releases to an actual, physically present processor that has already been designed, tested, marketed, and used by millions of people.
I could write a technical writeup for an imaginary processor that has xxxxx teraflops, only uses 1 watt of electricity, and only costs $5 to make. It's easy to do that stuff on paper because you don't have to deal with the technical hurdles and limitations that reality imposes. Talk is cheap. It's another thing entirely to actually produce the product and have it do all the things you said it would.
What if in the prototype stage they discovered that the working silicon doesn't perform as the design was expected to? What if they can't ramp up the clock speed on it due to unexpected reasons? What if they can make it but it ends up costing them more to produce than anticipated? These are things you have to work out before you can sell an actual product. But a write-up doesn't have to deal with these problems.
That's why I say that talk is cheap. It's easier to talk about something than to do it. Wait until the cell processor does come out and see how it performs. Promising the stars is very easy. Delivering the goods isn't quite so easy...