Slashdot Mirror


Ars Technica's Hannibal on IBM's Cell

endersdouble writes "Ars Technica's Jon "Hannibal" Stokes, known for his many articles on CPU technology, has posted a new article on IBM's new Cell processor. This one is the first part of a series, and covers the processor's approach to caching and control logic. Good read."

25 of 449 comments

  1. Workstation? by jericho4.0 · · Score: 4, Interesting
    From this site and others..

    " Last fall, IBM and Sony said they were developing a workstation based on Cell chips, which is the first product IBM will ship based on Cell."

    Regardless if this is the first product shipped or not, a workstation is coming. I can't see it running anything but linux. Given the mass market targeting of the cell, I hope Sony makes a strong go at grabbing the market with cheap hardware, rather than trying to milk the high-end content creation market first.

    --
    "A language that doesn't affect the way you think about programming, is not worth knowing" - Alan Perlis
    1. Re:Workstation? by Anonymous Coward · · Score: 1, Interesting

      I can't see it running anything but linux.

      Why not?

      IBM has previously said their chips had the potential to run right past Intel's in a couple years' time, to the point where the fastest way to run x86 would be on an IBM (non-x86) chip emulating x86 -- in the 10-20GHz range while Intel is still at 4-5GHz. The Cell architecture sounds ... pretty similar to what they said they'd be shipping this year.

      The people who aren't running Linux today aren't doing it because Linux isn't fast enough. They're doing it because they can't run their Windows apps on it, or because it's too hard to install, or because their scanner isn't supported yet. I know 0 people in the world who are holding off on Linux due to performance.

      So sure, they could run Linux on it. People probably will. But how many new computer architectures since MS-DOS have succeeded without MS-DOS compatibility?

    2. Re:Workstation? by node+3 · · Score: 2, Interesting

      I can't see it running anything but linux.

      OS X is another strong possibility. Sony's President was recently on stage with Steve Jobs at the Macworld Expo, hinting at working with Apple in the future. A recent slashdot story linked to an article which states that 3 PC manufacturers have been begging Apple to license OS X to them. I'll bet Sony was one of them, and IBM would also be a logical suitor.

      Since OS X is essentially NEXTSTEP 6, and the Cell workstations would be great for science or 3d, OS X is a likely candidate OS. The Cell is also going to be in TVs, so I could easily imagine the OS to be OS X (with a shell more logical for TVs, of course, not the Aqua GUI).

      This would be a bold move for Apple, Sony, and IBM, and it just seems so right.

      I agree, though, that they'll also run Linux--it runs on pretty much everything!

  2. Why do I have the sneaking suspicion no? by Anonymous Coward · · Score: 1, Interesting

    Beautiful as it is as a gaming CPU, I still am having trouble seeing how this thing would work in a PC of any sort. The only customer the Cell has so far -- Sony -- is talking about the stream processing units being directly coded for by the programmer. But general PC programmers still haven't accepted Altivec that well despite it being available, easy to use and useful, so how are they going to react to "rewrite your program to use SPEs"? Meanwhile, if the programmer does directly code for the SPEs then that's all well and good in a video game system, but it's yet to be explained to me how on earth the SPE works in a time-sharing operating system. Specifically, who gets to use it, when, and what do they do with the onboard SPE memory during a context switch? These are not trivial issues.

  3. More info in these slides by Namarrgon · · Score: 4, Interesting
    Scroll down a bit here, there's some more tasty tidbits.

    e.g. 234 M transistors (!) That's why I don't think this will be replacing the G5 any time soon. The die size (at the current prototype's 90nm) is over 200 mm2.

    It'll have to get a fair bit smaller/cheaper before the PS3 can use it without major subsidies, and I don't know why they think general consumer devices will want it. God knows how much power it dissipates with all 8 SPEs clocking over at 4 GHz...

    --
    Why would anyone engrave "Elbereth"?
    1. Re:More info in these slides by Anonymous Coward · · Score: 1, Interesting

      I can see Sony using effectively 'defective' cores in the PS3.
      The rumours are it may contain as many as 4 Cell modules - I doubt they will be '8 way' Cells.

      The die size means that defects are to be expected (even when it moves from 90nm to 65nm), so don't be surprised if we see '6 way', '4 way' and even '2 way' Cell modules making it out to market (all fabbed as '8 way' but shipped with whatever works).

      That's the only way I can see Sony/IBM/Toshiba making money on selling games consoles, cell phones and DVD players with this tech inside them.

      '8 way' for Workstations and Grids
      '6 or 4 way' for PS3 and lower workstations
      '2 way' for home entertainment systems.

      Apparently the '8 way' core is up to 10 times faster than the latest PC CPU (That puts it in the 50-60 GigaFLOPS range) ... I'm sure the PS3 would cope with a few '4 way' cores and a cell phone might just scrape by with a lower clocked '2 way' Cell ...

  4. iCell? by mpesce · · Score: 2, Interesting

    Although the article (which is quite clear) indicates that the AltiVec architecture is closer to G4 than G5, won't the speed increase of having 8 fully-parallel processors (9 if you count the main CPU) more than make up for the issues associated with the loss of the G5's advanced features? It seems to me that this is a natural for Apple - it will give them a 5x - 10x performance boost over anything that's on the drawing boards over at Intel.

    Even so, I doubt we'd see Cell-based Macs until at least 2007 - but wouldn't it be great to run PS3 games on your Mac? (As if that'll ever happen.) But then again, given the Cell architecture, your PS3 could use your Mac to make its games run faster! A whole new reason to have an XServe-based supercomputer...

  5. How do I code this thing?? by MagikSlinger · · Score: 3, Interesting

    The one thing I don't understand is how I would code for this thing. As best as I understand it, I now have some instructions for controlling the cache (or LAM, whatever) which sounds cool, but are there any details yet of how I'd write code for this? I'm also disappointed that the article didn't explain how one would use their SIMD instructions if they aren't using any of the existing standards. So I load my vectors with the cache control and ask the processors to ever so kindly add them?

    Anybody out there with experience on this architecture or even attended the presentation itself can give us mere coders details? Preferably a website.
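
    For what it's worth, here is a rough sketch in plain C of the pattern the article seems to describe: copy a chunk from main memory into an SPE-style local store, do the 4-wide vector add there, and copy the result back. Nothing below is the real Cell API. The names dma_get, dma_put, vec4, vec4_add, spe_style_add and LS_FLOATS are all made up for illustration, and the "DMA" is just memcpy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: none of these names come from IBM's toolchain. */
#define LS_FLOATS 1024                  /* floats per local-store buffer (assumed) */

typedef struct { float f[4]; } vec4;    /* one 128-bit vector of four 32-bit floats */

/* "DMA" modeled as memcpy between main memory and the local store. */
static void dma_get(float *local, const float *main_mem, size_t n) {
    memcpy(local, main_mem, n * sizeof(float));
}

static void dma_put(float *main_mem, const float *local, size_t n) {
    memcpy(main_mem, local, n * sizeof(float));
}

/* Element-wise add of one 4-wide vector; SIMD hardware would do this in one go. */
static vec4 vec4_add(vec4 a, vec4 b) {
    vec4 r;
    for (int i = 0; i < 4; i++) r.f[i] = a.f[i] + b.f[i];
    return r;
}

/* out[i] = a[i] + b[i] for n floats, processed in local-store-sized chunks. */
void spe_style_add(float *out, const float *a, const float *b, size_t n) {
    static float la[LS_FLOATS], lb[LS_FLOATS];    /* pretend local store */
    for (size_t done = 0; done < n; done += LS_FLOATS) {
        size_t chunk = (n - done < LS_FLOATS) ? n - done : LS_FLOATS;
        dma_get(la, a + done, chunk);             /* explicit "load the cache" step */
        dma_get(lb, b + done, chunk);
        size_t i = 0;
        for (; i + 4 <= chunk; i += 4) {          /* four floats at a time */
            vec4 va, vb, vr;
            memcpy(&va, la + i, sizeof va);
            memcpy(&vb, lb + i, sizeof vb);
            vr = vec4_add(va, vb);
            memcpy(la + i, &vr, sizeof vr);
        }
        for (; i < chunk; i++) la[i] += lb[i];    /* scalar tail */
        dma_put(out + done, la, chunk);           /* explicit write-back step */
    }
}
```

    So yes, roughly "load my vectors, ask the processor to add them, store the result", except that the loading and storing are the programmer's job rather than the cache's.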

    --
    The bitter lessons of a veteran coder: http://bitterprogrammer.blogspot.com
    1. Re:How do I code this thing?? by fuzzbrain · · Score: 4, Interesting
      I don't have much experience or knowledge but there was an interesting article the other week about how the next revolution in programming languages will be a turn towards concurrency:

      "Starting today, the performance lunch isn't free any more. Sure, there will continue to be generally applicable performance gains that everyone can pick up, thanks mainly to cache size improvements. But if you want your application to benefit from the continued exponential throughput advances in new processors, it will need to be a well-written concurrent (usually multithreaded) application. And that's easier said than done, because not all problems are inherently parallelizable and because concurrent programming is hard."


      Obviously, it's not clear whether this is directly relevant to cell processors, but I think it's at least of passing interest. It's also worth considering whether concurrency-oriented languages like Erlang and Oz could become more important with these sorts of processors (not for games but possibly for scientific work).
      See also the discussion of this article on Lambda.
  6. I understand by JeffTL · · Score: 1, Interesting

    that it runs at 30 watts, about like a Pentium M. And it's 64-bit. Can we say....

    Dare I say....

    Oh the Hell....

    PowerBook G5!

  7. Not useful for scientific computing by renoX · · Score: 4, Interesting

    What I find interesting is that the vector processors are restricted to single precision floating point calculations.
    This isn't terribly useful for scientific computations (there is the same problem with the GPU): currently the IEEE is working on a standard for 128bit precision floating point calculations!

    Of course for 3D, video and sound, 32bit precision is good enough and *if* programmers (a big if) manage to overcome the pain of 'parallel programming' then it could be a big success.

    1. Re:Not useful for scientific computing by Anonymous Coward · · Score: 1, Interesting

      Yeah, tell me about it! It's really frustrating to see all these "I can do parallel ops on 4*32bit floats" that can't just do the same for 2*64bit doubles[*]. That's the one place where SSE2 really spanks AltiVec - and for me it's the only one that really matters. Oh well ...

      [*] splitting a double into two floats isn't really the same - and not guaranteed to work well in all cases anyway. Besides, with 3rd party libs you don't even have the option.
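
      To show what I mean by the footnote, here is a minimal sketch of the naive split in C. The names float_pair, split_double, join_pair and roundtrip_error are mine, not from any library. Two 24-bit float mantissas cover at most about 48 of a double's 53 bits, and the exponent range is still the float's, so very small or very large doubles do not survive at all.

```c
/* Hypothetical helper names; this is the naive "double as two floats" trick. */
typedef struct { float hi, lo; } float_pair;

/* Split: hi is the nearest float, lo is what is left over after that rounding. */
float_pair split_double(double d) {
    float_pair p;
    p.hi = (float)d;
    p.lo = (float)(d - (double)p.hi);
    return p;
}

/* Join: widen both halves and add them back together. */
double join_pair(float_pair p) {
    return (double)p.hi + (double)p.lo;
}

/* Absolute error left after a split/join round trip. */
double roundtrip_error(double d) {
    double e = join_pair(split_double(d)) - d;
    return e < 0 ? -e : e;
}
```

      For something like 1.0 / 3.0 the pair gets far closer than a single float does, but a value like 1e-50 is below the float range and comes back as zero, which is the "not guaranteed to work well in all cases" part.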

  8. Golden opportunity for L4/Hurd by The_Dougster · · Score: 2, Interesting
    This arch is still a baby and this would be a great time for L4/Hurd to latch onto this processor. There is already an L4 PowerPC/64 port in some kind of development stage, and the very first platform is likely to be a PS/3 with somewhat fixed hardware specs. Marcus et al. were discussing something today, and they mentioned that nobody is working on the driver interface for L4/Hurd yet.

    Hurd might be an interesting candidate for running on Cell because of the highly threaded design. Hurd servers might be able to swap in and out of cells as they require cycles. It seems a good match; i.e. L4 runs in the main core, and various translators and other processes run on the cells. If a cell could be programmed to run the filesystem, for instance, it would totally free up the core for other business.

    Because the PS/3 will have a highly fixed hardware set, implementing a minimal driver set might be feasible given enough reverse-engineering effort.

    I'm not saying that L4/Hurd will kick the nuts off of Linux on an Opteron, I'm just noting that it might be pretty cool to experiment with Hurd on Cell technology. The L4/Hurd team is real close to getting the last pieces in place to compile Mach based Hurd under L4, and if you ever tried Debian GNU/Hurd, you know it's pretty near feature-complete and a pretty neat system to run. The next task for L4/Hurd is a driver infrastructure, and it might be wise to look at what Cell is bringing to the table before it gets too far along. Know what I mean?

    --
    Clickety Click ...
  9. Digital Rights Management by wakejagr · · Score: 5, Interesting

    Another article on the Cell design at http://www.theregister.co.uk/2005/02/03/cell_analysis_part_two/ seems to indicate that there is some sort of DRM built in.

    The Cell is designed to make sure media, or third party programs, stay exactly where the owner of the media or program thinks they should stay. While most microprocessor designers agonize about how to make memory accesses as fast as possible, the Cell designers have erected several (four, we count) barriers to ensure memory accesses are as slow and cumbersome as possible - if need be.

    Hannibal doesn't say anything about this (that I noticed) - anyone have more info?

    --
    Don't save Windows XP! http://www.petitiononline.com/jjw1xp/petition.html
    1. Re:Digital Rights Management by xenocide2 · · Score: 4, Interesting

      Sounds like an enormous misinterpretation of the concept of caching. As a multimedia programmer on the Cell, it's likely you'll have sole jurisdiction over where stuff goes on your processor. Think of it like programmable cache management. Usually that's pretty stupid, because you want to write things back for longevity, but media is more transient--streams and whatnot. Barriers within that context would be cache levels.

      But perhaps they've got some technical details (enough that they can count distinct features) that I can't find with a basic google search on the subject. It would certainly be out of Sony's previous style, though I understand they recently pulled their heads out of their collective asses and discovered that they were selling a loose metaphor of cars and crowbars at the same time, and came out with a public apology for sucking.

      --
      I Browse at +4 Flamebait

      Open Source Sysadmin

    2. Re:Digital Rights Management by Anonymous Coward · · Score: 1, Interesting

      Blachford and his parent company have equal reasons to cheer for DRM or against it (and, perhaps, for the 'right' or 'wrong' combination being in Cell, for that matter; Genesi are tied to PowerPC, but seem to be falling into Freescale's lap now that the former Motorola division is ready to sell to desktops that aren't Apple), so it's really hard to say.

      What I get from Hannibal's article is that GCC is going to be available, and will support the full ability of the underlying hardware unencumbered. This is a good thing, it means the possibility of fully Free/free/Nth-party implementations is there. They might not want to crow about certain internal aspects (beyond what they quietly feed into GCC's mcpu opts), vs. forwards-compatibility and all, but this makes it a perfectly usable platform for anyone crazy enough to roll his own... Not the wacky, totally locked-down black box I might've expected to see from Sony. (Not that a black box would make sense, but as others have said, that company's had an agenda.)

      Now, the not-so-good news. The GCC support noted, if it exists, only gives you the chance to use the chip to its full extent. All this other magic that's been talked about -- the grid computing, network-abstracted multiprocessing, buy enough Playstation 3s and the Internetweb becomes sentient and challenges you to Tic-Tac-Toe -- is going to live somewhere in the "OS" or "SDK," and the licensing on that code is unlikely to be free. That's "okay," you might not want your Linux running on top of a pile of proprietary crud anyway (though would Apple mind, if they can find a way to balance Darwin atop it, or afford a license to get a jumpstart on creating a more native library form?) ... but those are the buzzwords that have the kids jumping up and down in their cribs, I fully expect the software exists, and it means the "Free" world will be at a disadvantage until the GNU equivalents show up. There's Beowulf and MOSIX and so on, yes, but depending how far they've gotten, we might be seeing a platform for home computing tasks that 'does it all for you' -- while the OSes we prefer to run will still be struggling to accelerate MPEG decode or whatever across the units in a single package.

      That's cool, maybe you don't want all your processing tasks to start blowing around with nothing but crypto (if you're lucky) to secure them -- lord knows I have misgivings, myself -- but a decent subset of people won't mind, and it could have some serious impact in content production as well as gaming. I could be way off, Sony could screw it up or single-purpose it as much as they screwed up the Playstation 2, but if there's a way to bung these things together and they come acting 'computery' out of the box, this could do for things like 3D animation what Postscript did to dot-matrix. (While remembering that initial Postscript devices were pricey and slow, sure.)

      No wonder a guy who owns both a PowerPC computer company and an animation studio is interested...

  10. Eliminating Instruction Window by ndogg · · Score: 2, Interesting

    This RAM functions in the role of the L1 cache, but the fact that it is under the explicit control of the programmer means that it can be simpler than an L1 cache. The burden of managing the cache has been moved into software, with the result that the cache design has been greatly simplified. There is no tag RAM to search on each access, no prefetch, and none of the other overhead that accompanies a normal L1 cache. The SPEs also move the burden of branch prediction and code scheduling into software, much like a VLIW design.

    Why? The reason for the instruction window was to simplify software development.

    Of course, I like to play devil's advocate with myself, so I'll answer that question.

    The purpose of the Cell processor is to enhance home appliances, which have a greater reliance upon low latency than they do on precision, accuracy, and performance bandwidth. Thus, one can very safely say that the Cell processor will likely have little purpose in scientific calculations.
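
    To make the quoted bit about "no tag RAM to search on each access" concrete, here is a small C sketch contrasting a direct-mapped cache lookup with a plain local store read. The sizes (LINES, LINE_BYTES) and all the names are assumptions for illustration and are not taken from the Cell documentation.

```c
#include <stdint.h>
#include <stdbool.h>

#define LINES      64      /* assumed number of cache lines */
#define LINE_BYTES 128     /* assumed line size in bytes */

/* Conventional direct-mapped cache: every access compares a stored tag. */
typedef struct {
    bool     valid[LINES];
    uint32_t tag[LINES];
    uint8_t  data[LINES][LINE_BYTES];
} tagged_cache;

/* Returns true on a hit and writes the byte to *out; a miss would go to memory. */
bool cache_read(const tagged_cache *c, uint32_t addr, uint8_t *out) {
    uint32_t offset = addr % LINE_BYTES;
    uint32_t index  = (addr / LINE_BYTES) % LINES;
    uint32_t tag    = addr / (LINE_BYTES * LINES);
    if (!c->valid[index] || c->tag[index] != tag) return false;   /* tag check */
    *out = c->data[index][offset];
    return true;
}

/* Software-managed local store: plain indexing, no tags, never "misses". */
typedef struct { uint8_t bytes[LINES * LINE_BYTES]; } local_store;

uint8_t local_read(const local_store *ls, uint32_t ls_addr) {
    return ls->bytes[ls_addr % sizeof ls->bytes];
}
```

    The hardware saving is the tag compare and the miss handling; the price is that software has to know what is sitting in the local store at any moment.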

    --
    // file: mice.h
    #include "frickin_lasers.h"
  11. Re:Cellection? by Screaming+Lunatic · · Score: 2, Interesting
    Autovectorization is planned for GCC 4.0.

    gcc autovectorization page.
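
    As a rough illustration of what an autovectorizer looks for, here are two C loops: one that is a textbook candidate and one that is not. The function names are made up. With GCC you would ask for this with something like -O2 -ftree-vectorize, though whether a given loop actually gets vectorized depends on the compiler version and the target.

```c
#include <stddef.h>

/* A loop shape auto-vectorizers generally like: unit stride, no
   loop-carried dependence, and restrict promising no aliasing. */
void add_arrays(float *restrict out, const float *restrict a,
                const float *restrict b, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* A loop that cannot be vectorized as written: each iteration needs the
   previous iteration's result (a running sum written back in place). */
void prefix_sum(float *x, size_t n) {
    for (size_t i = 1; i < n; i++)
        x[i] += x[i - 1];
}
```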

  12. A proposal for Apple by Anonymous Coward · · Score: 4, Interesting

    I don't have an account, but this is an honest idea.

    Why doesn't Apple include a Playstation 2 support card into their Macintosh line?

    Problem: The OSX platform has almost no games. I own several macs, I love my macs, and I sincerely enjoy OSX. But it has no games, and that will never get better, especially as simpler games migrate to the web and the complex ones bail for the console market. The PC gaming market has essentially peaked.

    Solution: Embed (or include as a BTO option) a PS2 chipset to a Macintosh. Run the generated display straight through to the graphical overlay plane. Done.

    Everything works. The controllers are trivially converted to use USB. The DVD drive is already there. The display is already there. The USB and Firewire is already there. The harddrive is already there. The "memory cards" are already there.

    Reason: The Macintosh game library explodes instantly to encompass something like 3,000 PS1 and PS2 games. With no need for emulation, the games are guaranteed to work out of the box and provide the Apple ease of use everyone loves. Sony increases their marketshare, Apple gets a viable expanding game library, and users get a vastly better gaming experience on OSX for maybe $40 of parts and engineering.

    Why won't this work?

    1. Re:A proposal for Apple by Anonymous Coward · · Score: 1, Interesting

      Those are all firmware issues, and trivial ones at that.

      The thing that would make this work is that the PS2 uses essentially industry-standard parts and interfaces, including said DVD drives, USB ports, and Firewire ports.

      The PStwo includes every bit of this, AND a small case, AND shipping, AND accessories, AND a warranty, AND a manual, AND cables, AND a controller AND *retails* for only $150. There's no reason a moderately modified chipset with updated firmware to work on a Mac motherboard couldn't provide an entire PS2 core for under $100.

      And converting the internal video format to a digital signal would be a minor implementation detail. (if this needs to be done at all)

      Anyway, the point is that Apple would benefit by gaining a library of 3,000 quality games, hundreds of which are current A-list titles that equal or surpass the current PC games on the market in quality. That you can just buy a PS2 is beside the point.

      The point is that your mac now plays Playstation games out of the box, turning a serious platform liability (OMG macs don't have games lol) into an asset (all the best PC ports PLUS all the PS2 library).

    2. Re:A proposal for Apple by Anonymous Coward · · Score: 1, Interesting

      That wouldn't necessarily be the only target market. Let me frame it this way.

      Macworld Expo 2006:

      iPlay is all new in iLife '06. Starting today, all Macs become more playful. iPlay is like iTunes for your games.

      Download all new games just like you would download a song in iTunes Music Store. One click shopping, no install programs, no configuration, no hassles. Comes with a variety of classic games you'll enjoy, and there are thousands more Apple-approved games online.

      And Just One More Thing...

      Starting today, all Macintoshes now play your Sony Playstation discs, as easy as this.... *demo of Katamari Damacy 3 *

      iPlay. Not just for kids.


      And that's it. One white PS2 controller with USB connections free in the box of every Mac Mini 2 or iMac, and a pack-in game sampler. The entire PS1 and PS2 library plays in its own window upon disc insertion and works just like DVD Player.

      And oh, lo, what's this? You can now write, compile, and ship Mac-specific games STRAIGHT TO THE CONSUMER that are stubbed into the PS2 chip onboard so that you can have a RELIABLE, NO BULLSHIT, NO PLATFORM TESTING GAMING ENVIRONMENT ON EVERY CONSUMER MACINTOSH. It doesn't have to be the latest and greatest, it just has to work. Because believe me, playing in the high-end gaming market is not feasible for Apple anymore. Period. DirectX9 and DirectX10 on x86-32 is the development platform of choice, and cross porting is going to continue to be a charity and a rarity, assuming PC gaming has any real future beyond Platinum FPSes and MMOs. And this can work as a download since you won't have to require an entire DVD's worth of data with a PS2-platform Solitaire game or something. Just wrap a 6MB app in a PS2-compatible "ISO" or something and let the firmware and launcher app take care of the rest.

      This breathes all new life into the PS2 platform, provides a reliable gaming platform for Macs IN HARDWARE (no backporting, no incompatibilities with 2007's video card, no sobbing game vendors screaming for expensive new API support in OSX), and it fills in the one most glaring gap in the Apple User Experience.

      A merger between the Playstation and Apple platform is the only possible way I can think of to get simple, one-click, no-bullshit, no crash, downloadable, any Macintosh, pay-as-you-go-able games that work seamlessly, beautifully, and with a real gaming controller. And you could do all this for maybe $30 per unit.

      Emulation is not the answer, pleading with vendors to port from the PC is not the answer, creating second party development teams is a hugely expensive answer, and letting the game environment on the Mac continue to rot is not an answer I'd like to see.

      World of Warcraft is not going to carry sales up the Apple hardware line, no matter how good it is.

      Microsoft is rolling this online casual-gaming concept into XBox Live 3. If Apple wants this, this is the way to go. And they can get the biggest and best title library in existence to go with it.

      Just add a PS2 chip.

  13. Re:Mistake by TheNetAvenger · · Score: 5, Interesting

    A budget-class PC laptop of that time might have been about 900 MHz to 1.1 GHz. I wouldn't consider such a laptop anything near useable. They tended to have poor quality sound systems that bottlenecked the processor and atrociously short battery times. The ibook was legendary for its excellent battery performance

    Get off what you 'assume', assumption is just intuition for idiots.

    We have test 200mhz laptops with 80mb of ram and 5gb hard drives, released in 1997, all running WindowsXP Professional (yes, even with the themes turned on), and they benchmark faster than they did when they shipped with Windows 95.

    Secondly, they can do full 30fps video as long as it is uncompressed AVI or even WMV 9. QuickTime (MPEG4), MPEG2, and Real stutter horribly on video playback unfortunately.

    As for battery, don't know, these laptops hold for 3hrs with a single charge, and yes techs are REQUIRED and have no problems using them daily in test scenarios.

    Now if you really want to compare laptops to laptops, why don't I show you our 900mhz AMD Compaq laptops, they have JBL sound systems in them, and there isn't a single feature they cannot perform with the exception of running a T&L based video game, as the integrated video doesn't handle it, oh wait, the 900mhz PowerBook video didn't support such features either. (BTW, This is not to say that there are not several 900-1000mhz class laptops that have upper end video features), I am just using what we have in our test labs for comparison.

    The 900mhz laptop has a DVD/CDRW, came out late 2000 early 2001 (trying to remember if we got them before holidays or not). They do full software DVD decoding with less than 20% CPU utilization and pretty much do anything fairly fast that we throw at them. We even have a beta version of Windows 2003 server running on one with 256mb of RAM. (Yes we are always pushing the limits, but it works as fast as the WindowsXP pro version of the machine sitting next to it.)

    Now off my rant... Macs truly are great, and the PowerBooks of the time were great, but that DOES NOT MEAN they were the BEST, WILL ALWAYS BE THE BEST, or you should be complacent listening to Apple tell you what you are getting is the best when it might not be. It is time for us as MAC users to stand up and DEMAND that technology becomes as much a part of what a MAC is as the EASE of USE in the Interface.

    The time is now, we need to STOP accepting what they tell us and give us and force them to truly give us the LATEST technological concepts, not just the above average concepts when compared to the PC world. These are Macs, they SHOULD BE BETTER. IT shouldn't even be subjected to a debate they should be so far advanced a debate should not be possible. PERIOD.

    Sadly, it just isn't true now, and has not been for many years. OSX has given the Mac world some credibility backing OS technology, but now Apple needs to take Macs to the next level.

    Even if my comment inspires one Mac user to say hey Apple, we want better, then maybe we all can be the symbolic person with the hammer from their 1984 video and WAKE THEM UP this time.

  14. Re:Export controls? by Anonymous Coward · · Score: 1, Interesting

    If you remember when the G4 came out Apple advertized that the military didn't want that thing leaving the country.

    Oh really?

    Sorry, don't believe you. I contend that either of the following possibilities is more likely :

    1. You are getting confused with the whole PS2s being bought up by Saddam thing (also total PR bullshit).

    2. It was just made up by deluded Mac zealots that actually believed that the G4 was a "Super Computer".

    Feel free to prove me wrong - preferably with actual evidence.

  15. Re:Interesting by divisionbyzero · · Score: 2, Interesting

    Well, it certainly might seem that he is being a hypocrite. See:

    "In another part of the article, Blachford claims that the cell processing units have no "cache." Instead, they each have a "local memory" that fetches data from main memory in 1024-bit blocks. Well, that's sort of like saying that an iMac doesn't have a "monitor," but it does have a surface on which visual output is displayed. In other words, the Cell "local memories," which are roughly analogous to the vector units' "scratchpad RAM" on the PS2's Emotion Engine, function as caches for the PUs. What has thrown the author for a loop is that they're small, and the fact that they're tied to each cellular processing unit means that they don't function in the memory heirarchy in the exact same way that an L1 does in a traditional processor design. They do, however, cache things. But maybe I'm being nitpicky with this."

    and

    "Finally, to address something more specific to the Cell architecture itself, on page 1 we find this claim:

    It has been speculated that the vector units are the same as the AltiVec units found in the PowerPC G4 and G5 processors. I consider this highly unlikely as there are several differences. Firstly the number of registers is 128 instead of AltiVec's 32, secondly the APUs use a local memory whereas AltiVec does not, thirdly Altivec is an add-on to the existing PowerPC instruction set and operates as part of a PowerPC processor, the APUs are completely independent processors.

    The author appears to be confusing an instruction set with an implementation. The 128-register detail is a problem, because, as the author correctly points out, conventional Altivec has only 32 vector registers. So obviously it's a given that Cell won't be using straight-up Altivec. But it's entirely possible that it'll use some kind of 128-register derivative of the Altivec instruction set. The fact that the individual processing units have a local cache has little to do with whether or not the PUs themselves implement some hypothetical Altivec derivative. Finally, the statement, "Altivec is an add-on to the existing PowerPC instruction set," is correct, but the rest of that sentence--"and operates as part of a PowerPC processor"--doesn't make a whole lot of sense to me in this context. Altivec is an ISA extension that is implemented in different ways on different PowerPC processors. The Cell processor's PUs could very well implement a hypothetical 128-register Altivec2 ISA extension, or they could implement some other SIMD ISA extension. The fact that SIMD code, written to whatever ISA, is farmed out to individual PUs has nothing to do with it. (If what I just said confuses you, you might check out this article.) "

    compared to

    "The main differences between an individual SPE and an early RISC machine are twofold. First, and most obvious, is the fact that the Cell SPE is geared for single-precision SIMD computation. Most of its arithmetic instructions operate on 128-bit vectors of four 32-bit elements. So the execution core is packed with vector ALUs, instead of the traditional fixed-point ALUs. The second difference, and this is perhaps the most important, is that the L1 cache has been replaced by 256K of locally addressable memory. The SPE's ISA, which is not VMX/Altivec-derivative (more on this below), includes instructions for using the DMA controller to move data between main memory and local storage. The end result is that each SPE is like a very small vector computer, with its own "CPU" and RAM."

    But if you read closely you will see that Blachford, to generalize, was "right" (e.g. local memory and no AltiVec on SPE) for the wrong reasons, and even then some of the info was factually incorrect (e.g. SPE fetches blocks of 1024 bits). I do think that Hannibal was too hard on the guy (probably because of his completely unsubstantiated claims about performance) and I think Hannibal should've cut Blachford some slack based on the source material that Blachford had available to him (although Blachford's

  16. Re:As a total Cell/PS2-coding n00b... by Anonymous Coward · · Score: 1, Interesting

    As well as
    for (i = 0; i < n; i++) {
        sendpacketWithThisData(&i);
    }

    And since the compiler isn't smart enough to know which functions can be parallelized and which ones can't, or which ones are reentrant (multiple threads can call it at once), it would assume that the calls must be done one at a time.

    Also, one thing no one seems to catch on to is that you have to consider the amount of time it takes to load the code section and data onto one of the SPEs, then kick it off, detect it being finished and read the result. How would this compare to doing the operation locally rather than farming it off to another execution unit? For performing simple tasks, the SPEs would be slower, and if the task you're handing off is complicated and doesn't lend itself well to fitting in 284k (?) of memory then you have 8 times the cache misses for 8 processors = not much speed increase. I'm deeply curious about this processor but there are so many questions that need to be answered.
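
    The overhead argument above can be written down as a back-of-the-envelope cost model in C. Every number you feed it is a guess; the struct fields and function names are made up for illustration, and nothing here is a measured Cell figure.

```c
#include <stdbool.h>

/* All cycle counts are made-up inputs for illustration, not measured
   Cell figures. */
typedef struct {
    double setup;      /* load code onto the SPE and kick it off */
    double transfer;   /* move data in and results back */
    double compute;    /* time the SPE spends on the work itself */
    double poll;       /* notice completion and read the result */
} offload_cost;

/* Total cost of farming one task out to an SPE. */
double offload_total(offload_cost c) {
    return c.setup + c.transfer + c.compute + c.poll;
}

/* Offloading only pays off if it beats just doing the work locally. */
bool offload_pays_off(offload_cost c, double local_cycles) {
    return offload_total(c) < local_cycles;
}
```

    With a small task (say 850 cycles of total offload cost against 100 cycles done locally) it loses badly; it only wins once the compute part dwarfs the fixed setup and transfer costs.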

    One rumor I heard is that you wouldn't be able to write code for the PPC core. Instead, it would act like the os coordinating threads and performing scheduling, and each SPE would be operating on one of the threads running on the system at any given time. If you write multithreaded apps you would literally get almost 8x performance. But this is highly dubious to me because your cache miss rate would make it crawl.