Slashdot Mirror


User: adam31

adam31's activity in the archive.

Stories: 0
Comments: 205

Comments · 205

  1. Re:Ran simulations, not code on The Potential of Science With the Cell Processor · · Score: 1
    Sycraft-fu, I understand your skepticism, and I think it's unfortunate that they didn't publish physical timings. Your post has 3 main points: 1) Their simulations don't factor in something that will account for additional slow-down. 2) Their compilers aren't adapted, and that will contribute to slowdown. 3) Realistic improvements are incremental.

    1) The Cell is actually a pretty simple architecture. Once memory is transferred to SPE local store, performance is deterministic within a fraction of a percent. The big question mark is the performance of the DMAC and XDR, in both bandwidth and latency. Because the paper consistently assumes 25.6 GB/s (the theoretical max memory bandwidth), I feel that bandwidth will be the cause of any unexpected slow-down. Achievable bandwidth should be somewhere around 18-24 GB/s, and the shortfall will only affect operations that are memory-bound. They assume 1000 cycles of latency, which should be sufficient in any case.

    2) The fact is that their simulations were run using machine code generated from a real compiler. The language of the source code is irrelevant. More logical is to argue that other compilers have exhausted their potential, while Cell compilers are still in their youth. More straightforwardly, you can argue that a typical compiler has three main deficiencies: it doesn't appreciate the cost of spilling to the stack, it is a slave to correctness in the face of any potential pointer aliasing, and a compiler's nature is scalar processing. The three answers as far as Cell is concerned are: 128 vector registers minimize spilling, all aliasing can be hidden by the restrict keyword + local intrinsic variables, and SPEs are vector-only, with integer and FP sharing a register file. Never has an architecture been more ripe for compiler optimization.

    3) I don't know that this is more than an incremental step, at least as far as high performance computing technology is concerned. It is fundamentally different from AMD64, for example, but in a way that addresses major concerns. 128 registers per SPE, a 25.6 GB/s bus, 256 KB of L1-speed memory per processor, all at minimal power consumption... plus they can be linked together on a 35 GB/s bus. The key is that if I ask you to point to the major architectural bottleneck... could you?

    I remember many years ago, I was listening to a talk given by a Pixar tech guy. He articulated that one of the primary benchmarks they used in how to construct their renderfarm was flops per meter cubed (based on performance - heat dissipation of rack space). The Cell isn't quite revolutionary, but it will make many companies re-evaluate their high-performance needs.

  2. Re:What about the compiler? on The Potential of Science With the Cell Processor · · Score: 3, Informative
    I am also an experienced assembly programmer, and I too shared your mistrust of the compiler. However, I started SPE programming several months ago and I promise you that the compiler can work magic with intrinsics now. Knowledge of assembly is still helpful, because you need to have in mind what you want the compiler to generate... make sure it sees enough independent execution clumps that it can cover latencies and fill both the integer pipe and FP pipe, understand SoA vs AoS, etc. But you get to write with real variable names, not worry about scheduling/pairing of individual instructions or loop unrolling issues.

    Some of my best VU routines that I spent a couple weeks hand-optimizing, I re-wrote with SPE intrinsics in an afternoon. After some initial time figuring out exactly how the compiler likes to see things, it was a total breeze. My VU code ran in 700 usec while my SPE code ran in 30 usec (@ ~1.3 IPC! Good work, compiler).

    The real worry now is becoming DMA-bound. For example, assume you're running all 8 SPEs full-bore and you write as much data as you read. At 25.6 GB/s, you get 3.2 GB/s per SPE, so 1.6 GB/s in each direction (assuming perfect bus utilization); @ 3.2 GHz, that's 0.5 bytes/cycle. So, for a 16-byte register, you need to execute 32 instructions minimum (at one instruction per cycle) or you're DMA-bound!
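
    To make that arithmetic easy to replay with other numbers, here is a rough Python sketch. The function name and parameters are made up for illustration, the figures are the ones quoted above, and it assumes one instruction issued per cycle and perfect bus utilization.

```python
# Back-of-envelope check of the DMA-bound estimate.
# Figures come from the post; the function name, parameter names, and the
# one-instruction-per-cycle default are illustrative assumptions.

def min_instructions_per_register(
    bus_gb_per_s=25.6, num_spes=8, clock_ghz=3.2, register_bytes=16, ipc=1.0
):
    per_spe = bus_gb_per_s / num_spes            # GB/s available to each SPE
    per_direction = per_spe / 2                  # reads and writes split evenly
    bytes_per_cycle = per_direction / clock_ghz  # (GB/s) / GHz = bytes per cycle
    cycles_per_register = register_bytes / bytes_per_cycle
    return cycles_per_register * ipc             # instructions needed to cover the transfer


if __name__ == "__main__":
    print(min_instructions_per_register())
```

    With the defaults this comes out to 32; dropping bus_gb_per_s to 20.0 pushes the threshold to roughly 41 instructions per register.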

    Food for thought.

  3. Re:Ease of Programming? on The Potential of Science With the Cell Processor · · Score: 1
    I suspect that the Cell's design is not as elegant (from a programmer's POV) as it could have been, only because it was not designed with an elegant software model in mind.

    It's possible that this is the case; however, IBM is actively working on compiler technology to abstract the complexity of an unshared memory architecture from developers whose goal isn't to squeeze the processor:

    When compiling SPE code, the compiler identifies data references in system memory that have not been optimized by using explicit DMA transfers and inserts code to invoke the software-cache mechanism before each such reference.

    So for developers who want performance, the architecture is ideal. 2 Megs of L1-speed memory, a 25 GB/s bus servicing 8 processors each with 128 128-bit registers. And for the rest, it's still a high-performance programmer-friendly development environment.

    Your point is not going unnoticed by IBM.

  4. Re:What about the compiler? on The Potential of Science With the Cell Processor · · Score: 3, Informative
    Actually bullshit.

    Actually, it's not bullshit. Simple C intrinsics code is the way to go to program the Cell... there's just no need for hand-optimized asm. Intrinsics have a poor rep on x86 because SSE sucks. 8 registers. A source operand must be modified on each instr, no MADD, MSUB, etc.

    But Cell has 128 registers and a full set of vector instructions. There's no danger of stack spills. As long as the compiler doesn't freak out about aliasing (which is easy), and it can inline everything, and you present it enough independent execution streams at once... the SPE compiler writes really, really nice code.

    The thing that does still need to be hand-optimized is the memory transfer. DMA can be overlapped with execution, but it has to be done explicitly. In fact, algorithms typically need to be designed from the start so that accesses are predictable and coherent and fit within ~180 KB. (Generally, someone seeking performance would do this step long before asm code on any platform anyway...)
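
    As a rough illustration of why the overlap matters, here is a toy timing model in Python. It is not SPE code and uses no real DMA API: the function names, the equal per-chunk costs, and the choice to split the ~180 KB budget into two buffers are all assumptions, and only input transfers are modeled.

```python
# Toy timing model for double-buffered transfers (not SPE code).
# Assumptions: every chunk costs the same DMA time and the same compute
# time, and one transfer can be in flight while one chunk is processed.

def chunk_count(total_bytes, budget_bytes=180 * 1024, buffers=2):
    """Chunks needed when the local-store budget is split across buffers."""
    chunk_bytes = budget_bytes // buffers
    return -(-total_bytes // chunk_bytes)  # ceiling division


def serial_time(n_chunks, dma, compute):
    """Fetch a chunk, process it, then fetch the next one."""
    return n_chunks * (dma + compute)


def overlapped_time(n_chunks, dma, compute):
    """Fetch chunk i+1 while chunk i is being processed."""
    if n_chunks == 0:
        return 0
    # First fetch and last compute cannot be hidden; in between, each step
    # costs whichever of the two is slower.
    return dma + (n_chunks - 1) * max(dma, compute) + compute


if __name__ == "__main__":
    n = chunk_count(1024 * 1024)
    print(n, serial_time(n, 2, 3), overlapped_time(n, 2, 3))
```

    For a 1 MB working set this gives 12 chunks; with 2 time units of DMA and 3 of compute per chunk, the serial schedule takes 60 units and the overlapped one 38. Once compute time exceeds transfer time, nearly all of the DMA cost is hidden.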

  5. Re:It's probably NOT fake... on Sony Fakes Blu-Ray Demo? · · Score: 3, Interesting
    At least Slashdot didn't pick up the Inquirer's goofed story about Sony running GT:HD on PCs at the E3 conference. Apparently it was based on an image suggesting that only rack-mounted servers were to be found on the floor.

    Too bad those rack-mounts are PS3 devkits! With all the faked Sony bashing, it's clear why no one pays attention when they do do something crooked.

  6. Re:Anyone who thinks Sony didn't copy the Wii? on Controller Comparison - PlayStation 3 vs. Wii · · Score: 1
    You can't have it both ways...

    Either the controller felt laggy because the controller sucks or because WarHawk only had a few weeks to work on it... It makes no sense to say that *both* are true.

    IMO, you're right. Sony got the controller working very late, and WarHawk only had a few weeks to work on it... and despite that many reviews have said that the controller felt quite natural. Of course, in the end it's a matter of taste.

  7. Re:Tricks microsoft on Microsoft Sides With Nintendo Against Sony · · Score: 1
    This is classic!

    Microsoft is scared shitless of the Wii. They know there's no way in hell that Sony won't sell out of PS3s this holiday, so they're not even in the equation. Microsoft is looking at the fact they're going head-to-head with Nintendo for the budget crowd.

    Of course, after the dust settles, supply will catch up with demand and the real battle begins. Expect a price drop soon after the holidays from Sony.

  8. Re:It IS a last-minute gimmick... on Warhawk and The Dualshake Controller · · Score: 1
    It can't be that last minute for it to be reviewed so well by everyone with hands-on sessions.

    My theory is that they originally tried to put rumble, motion-sensing, and extra batteries into the thing, prompting the "Batarang" design because it all took so much room. So they were fighting a three-front war: the lawsuit they are losing over rumble, widespread criticism of the Batarang, and the extra weight to support it all. So they decided to win all three wars at once by ditching rumble. Now all that is left is a motion-sensing, lightweight, familiar controller.

    As for the "last-minute" WarHawk comments, it's most likely that Sony wasn't sure if they could get it all working in the new design-spec in time for E3. The decision to show it off with WarHawk probably was last-minute, and that demo would surely have been cut if the WarHawk team hadn't pulled through.

    As for drawing attention away from Wii: Sony wants Nintendo to kick Microsoft's ass. Sony hopes that every single person who decides that the PS3 is too expensive will buy a Wii... I promise. And then later buy a PS3 when they're cheaper to buy and cheaper to produce.

  9. Re:Sony overconfident? on Sony's Conference The Day After · · Score: 1
    The reality is that Sony will easily blow through those 6 million units. $599 just isn't a steep enough price barrier.

    The real question is what happens to those people that are turned away in the long lines? Do they pick up an Xbox or a Wii? Sony is what it is... the real battle is between Microsoft and Nintendo this holiday season.

    As for Sony's final market share, that will become apparent as early adopters let the world know if the system is worth it... or worth it at what price. People run around now orgasming judgements, like they matter, and I guess that's just E3.

  10. Re:Sadly, not a lotta FPU hardware. on Octopiler to Ease Use of Cell Processor · · Score: 1
    64-bit double precision floats are [somehow?] implemented in software and bring the chip to its knees.

    "implemented in software" (you mean microcoded) : False.

    Doubles are processed at 25-30 GFlops, vs 256 for SP (an order of magnitude of difference). However, I've also read that the current incarnation of the Cell is geared toward the PS3, and later versions meant for scientific computing will have much improved double support.

    But Flops aren't so crucial. What is crucial is memory bandwidth and scalability, 2 factors that have dictated the design of Cell from the ground up. The processing speed of a 1,000 node computer will be memory-bound for most real problems.

  11. Re:The biggest danger of broadband on We Don't Need No Stinkin' Broadband · · Score: 1
    I did go back... to slow (30 kB/s) "broadband DSL" at $25/month. It all happened when Comcast came to town and jacked up broadband and cable, while offering a *package* deal.

    So, having ditched real broadband and cable, I just don't miss it. That's as simple as it can be said. For a basic package, it'd be an extra $1000 per year. Honestly, the money isn't even that important, but anything that lures me to watch more TV and use the Internet even more would not make me happier.

    So I'm curious... now that Napster's dead, and pr0n jokes aside, what is the point of a Big Burly Broadband?

  12. Re:It'll grow into itself. on PlayStation 3 May Play Too Much · · Score: 1
    Sony is doing a lot of TALKING RIGHT NOW. People are justifiably nervous.

    Actually, the opposite is true. Sony hasn't said a single thing other than "we re-iterate, Spring 2006" since around October. The talk is all from the media circus, format war hype, images leaked from a page in a magazine. Guess, guess, guess, what is the silent beast thinking?

    Microsoft is the one that went around talking... Halo 3! HD-DVD Peripheral! More Shipments Soon! No Hardware Malfunctions!

    As they say, speak softly and carry a big stick.

  13. Re:Just learned something new on IBM's Radical Cell Processor · · Score: 1
    The Cell won't be terribly well suited for AI either, so you probably don't have much to look forward to. Game AI is typically notoriously branch-heavy and often tends to be mostly integer code

    The vast majority of the expense in adding another AI character to a scene is that each does a proximity query, maybe some Line-of-Sight tests, has its skeleton animated, does collision detection with its environment, has a shadow calculated and then renders a few thousand elegantly shaded triangles. Now the Cell will do each of these heavily-linear, heavily FP tasks at full speed in parallel.

    The AI Logic section isn't even a blip. "Lots of branches" won't affect performance until 'Lots' becomes a few hundred thousand, where a typical AI does a few dozen, maybe a hundred, per frame.

    If you have wicked enough code that the branching and integer calculation in AI Logic is your bottleneck, then the Cell has done its job! Maybe the Cell architecture isn't revolutionary; it was designed by many of the same people who designed the Supercomputers that were its precursors. The revolutionary part is that it is so low cost with such low energy consumption.

  14. Pricey... but Interesting on Sony Reader Taking Hold? · · Score: 1
    Maybe it'll root your... book. Har har Lame.

    Anyway, the good bits are that it lasts 7,500 pages per charge and weighs half a pound. The bad news is that it costs $300.

    The bottom line is that I love the idea of not burning a forest of trees... for College textbooks this is a great idea (lessens back pains and you could easily drop $300 in a single quarter!), not to mention point-and-click TOC and index, keyword search, etc. I'll have to see the screen first-hand, but I can't believe it would be better than print. Still, is there really a market for this for the airplane-book-club crowd?

  15. AMD + Rambus Multicore on Rambus Allowed to Continue Patent Dispute Case · · Score: 3, Interesting
    $75 mil deal and AMD gets access to "the good stuff". There's a reason XDR is going into the Cell processor, and it's because 25.6 GByte/s is the right bandwidth to feed 9 cores @ 3.2 GHz. But it's way, way more than you need for a dinky 1- or 2-core processor (for those you're better off spending money on the super low-latency SRAM instead).

    So does this mean that AMD is jumping on the many-multicore design bandwagon? They must have something up their sleeve...

  16. Re:Then they'd better get 'ready' for multithread on Are three cores better than two? · · Score: 2, Informative
    Simply going from 2 to n cores is not as easy or rewarding as it might sound. First, it's not easy because there are many interdependencies in the way data is accessed and manipulated by games... plus most have a number of global managers for various tasks, and global data leads to lots of sync points in code.

    It's not that rewarding because the memory bandwidth and low-latency local memory must increase as well to be able to feed the computations. In fact, I will guess that even at a massive 25.6 GB/s bandwidth on the PS3, a properly architected game will still be bus-bound.

    So, in the short term parallelization will take the form of tasks that are compute-heavy and don't need to be sync'ed. Cool particles, or cloth sims, or asset streaming and decompression. Then it's a diminishing returns game as we move from 4 procs to n.
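
    One standard way to put numbers on that diminishing-returns curve is Amdahl's law, which isn't named above but fits the argument. A minimal Python sketch, assuming (purely for illustration) that 80% of the frame's work parallelizes cleanly and ignoring the bandwidth limits mentioned earlier:

```python
# Amdahl's law: speedup from n cores when only part of the work is parallel.
# The 0.8 parallel fraction used below is an illustrative assumption, not a
# measurement from any game.

def amdahl_speedup(parallel_fraction, cores):
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    for cores in (2, 4, 8, 16):
        print(cores, round(amdahl_speedup(0.8, cores), 2))
```

    With 80% of the frame parallelized, 4 cores give about 2.5x, and no number of cores gets past 5x, because the serial 20% (the sync points and global managers) never shrinks.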

  17. Re:Yes, it is. on Blu-Ray vs. HD-DVD Not Over Yet · · Score: 1
    the true future of digital content distribution WILL be online. So all this cacophony is for a temporary technology.

    Not for HD content. How many people have both the bandwidth and the patience to actually download 20 GB of content? Certainly not enough to win the war. That's why you see PC makers whimpering for this Managed Content-- discs are the only viable distribution device for the next decade, and if they can't get that content to the computer Vista isn't really relevant.
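
    For a sense of scale, here is a quick Python calculation of how long a 20 GB download takes at a given line rate. The helper name is made up, sizes use decimal gigabytes, and the 1.544 Mbit/s figure is the nominal T1 rate (the T1 line comes up in the joke further down); real throughput would be lower.

```python
# Rough download-time estimate; ignores protocol overhead and congestion.
# download_hours is a hypothetical helper name used only for this sketch.

def download_hours(size_gb, rate_mbit_per_s):
    bits = size_gb * 1e9 * 8                  # decimal gigabytes to bits
    seconds = bits / (rate_mbit_per_s * 1e6)  # Mbit/s to bit/s
    return seconds / 3600


if __name__ == "__main__":
    print(round(download_hours(20, 1.544), 1))  # nominal T1 rate
```

    At T1 speed that is a bit under 29 hours for a single 20 GB title.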

    And of course, providers don't trust Microsoft with their content. And consumers don't want to waste that much hard drive space. It's not that free. And it's demanding too much upgrading at once...

    "Now that you have your HDTV, simply upgrade your computer to Vista, buy a 500 GB hard drive cluster, order up a T1 line and PRESTO! You can now watch movies from your Computer... Or you could just use the player that comes with your PS3."

  18. Re:Wikipedia article question on IBM Releases Cell SDK · · Score: 1
    You were off to a really good start! But a couple of things:

    Optimization won't be a problem. At least it won't be the main problem. The instruction set is rich enough to provide scalar and vector integer/fp/dp operations along with both conditional branching and conditional assignment. And it can be programmed in C using intrinsics for SIMD instead of assembly. That brings up the really important part-- 128 128-bit registers. Current x86 compilers suck balls at intrinsics mostly because SSE is such a register-starved instruction set. 128 registers allows lots of unrolling without any premature loads/stores from the stack.

    The main problem is memory. Or structuring a memory-flow so that everything ends up in the local store when it is needed. Many current programs are written pretending to have zero-latency access to everywhere in memory, or they follow several levels of indirection to get to the data they process. Those need to be re-thought so that memory access patterns are predictable and "the meat" gets to the store before it's needed.

    The other memory problem is that these SPEs have their best bandwidth when they talk to each other, and not when they DMA from main memory. However, it's very unclear how to leverage that bandwidth. Certainly the complexity of memory patterns that programmers will have to deal with to get maximum performance dwarfs the problems they will have in optimizing the code that processes the data.

  19. Next-Gen, Riiiiight. on Gavin Carter Discusses Elder Scrolls · · Score: 0, Troll
    We're locking the framerate at 30fps on the 360

    We don't do a lot with dynamic landscape changes beyond things like grass swaying in reaction to the weather.

    No cloth physics unfortunately. We found they are a huge sink for processing time

    So, wait. This game is 'next-gen'!? It sounds like all they did was port their pixel shaders to SM3.0 ...

    Now I'm sure the gameplay is great. But what are they doing with all the extra cycles? There just isn't an excuse to run 30fps any more. Just slapping some over-saturated bloom effects on the framebuffer doesn't cut it.

  20. Intellectual Design on MIT Professor Fired over Fabricated Data · · Score: 5, Funny
    Sometimes when an experiment doesn't go as hoped, its Creator must guide the results intelligently.

    Welcome to Science!

  21. Average IQ increase. on Everything Bad is Good for You · · Score: 3, Funny
    which corresponds to an increase in average IQ scores in the U.S.

    Ah yes, the fabled "increase in average IQ score"... Apparently, we just cracked 100!

    However, I predict a plateau for the foreseeable future.

  22. Re:what does the slashdot crowd do on Allard 'Gets Real' With IGN · · Score: 2, Insightful
    a 500,000 ton tanker has difficulty changing course, but, lo and behold, that doesn't mean it can't actually change course, SLOWLY, but inevitably

    Sure, but just because the cook announces "Let's go to Norway!" doesn't mean the ship's changing anything.

    See, we've all gotten used to Microsoft (and Intel) talking about doing things. They say lots of things! They're either bashing some competitor or talking about some future release, and it always ends up in some horrific mess that is definitely NOT good for consumers.

    So, when everyone is knee-jerk skeptical about Microsoft's announcements, it's because we'll only really believe it when they've already Done Something. Not just more talk. Microsoft and Intel seem incapable of anything except making announcements.

  23. Re:Is XBOX 360 & HD DVD a sure thing? on Blu-Ray The Flavour of The Moment · · Score: 3, Informative
    Bill Gates on Blu-Ray:

    Gates: Well, the key issue here is that the protection scheme under Blu-ray is very anti-consumer and there's not much visibility of that. The inconvenience is that the [movie] studios got too much protection at the expense consumers [sic] and it won't work well on PCs. You won't be able to play movies and do software in a flexible way.

    And there it is. As simple as it can be. Microsoft wants the PC to be the center of everything. All your movies, email, music... the motherbrain of entertainment. But the only way to get HD content to the PC is through the Xbox 360, because HD-content drives won't be available for the PC for quite a while, and 'downloadable' just isn't an option for Hollywood (not to mention bandwidth constraints). So in Bill's mind, the Xbox 360 is just a content delivery service to keep the PC in power.

    Sony, meanwhile, has no real interest in the PC. In fact, there's absolutely no reason why the PS3 can't be leveraged to take care of the main PC services. Miyamoto has already announced that Linux will ship pre-installed on every PS3 hard drive, just attach a USB keyboard and mouse. IBM is already on board with the Cell, so you see the triumvirate forming... with the PC in the corner gathering dust.

    I'm not saying that's the future, I'm just pointing out the battle lines. If Microsoft can't guarantee that content will find its way on to the PC, its plans are very much in disarray.

  24. Re:What I'd REALLY like to know on IGN Talks Games Industry Salaries · · Score: 1
    I wasn't aware that Burger King pays benefits. I sure do like all this health care.

    Anyway, your assumptions are just way wrong. I'd say I'm pretty typical for a game developer. In my career I've worked probably a total of 10 90-hour work weeks. I'd say 85% of my weeks have been fewer than 60 hours, and that the median work-week is about 50 hours.

    Not to mention that there is a lot of intrinsic joy in making games. Solving new and interesting problems constantly. It's like getting paid to solve chess puzzles, including the euphoria of finally producing the solution.

    And I still have time to work out at the gym every day and cook dinner for my girlfriend every night. Simply because you've heard stories about the bad times (which are true), you can't extrapolate that to "80 hour work weeks, working 50 weeks out of the year". That's nonsense.

  25. The real problem with photorealism on The Future of Videogame Aesthetics · · Score: 3, Insightful
    It looks worse. When you see an attempt at photo-realism, the mind is faced with a true/false dilemma and focuses on the details that are wrong. When you see good looking stylized environments, the judgement becomes more aesthetic.

    This is a large reason why Pixar had such a small screen-time of humans in Toy Story, A Bug's Life, Toy Story 2, etc... because humans are really, really attuned to the visual qualities of other humans. If anything looks wrong (an expression, an animation, the skin folding, the hair, cloth), it all looks wrong. Even Geri's Game was very stylized, instead of trying to mimic the photo-realistic visuals of an old man.

    Most artists aren't even capable of it (I guess we should call it "video-realism" instead, since the motion is at least as important as the still image). And for the few that are, it takes a long, long time.