Slashdot Mirror


Grand Unified Theory of SIMD

Glen Low writes "All of a sudden, there's going to be an Altivec unit in every pot: the Mac Mini, the Cell processor, the Xbox2. Yet programming for the PowerPC Altivec and Intel MMX/SSE SIMD (single instruction multiple data) units remains the black art of assembly language magicians. The macstl project tries to unify the architectures in a simple C++ template library. It just reached its 0.2 milestone and claims a 3.6x to 16.2x speed-up over hand-coded scalar loops. And of course it's all OSI-approved RPL goodness."

5 of 223 comments

  1. The future by johnhennessy · · Score: 3, Insightful

    Surely people can now start to see where the future lies - from a performance viewpoint. We've reached the end of the clocking "free lunch" (see http://www.gotw.ca/publications/concurrency-ddj.htm/).

    The way forward is turning the CPU (of a traditional architecture) into a nanny for a range of dedicated processing units. IBM saw this years ago, and thus began the whole Cell architecture - but I suspect that their job was much easier. The software that would run on the platform they are designing is fairly specific - games & multimedia, which usually lend themselves well to vectorization.

    The real challenge for architects (in my humble opinion) will be applying the same technique to other system bottlenecks.

    AMD's (and now Intel's) approach of cramming more and more processing cores onto an IC might pay off in the short term, but like the "free lunch" of clock speed, it will hit a roadblock when memory bandwidth and caching schemes just have too much work to do with 4 or 8 processing cores hacking at them all the time.

    --
    [ Monday is a terrible way to spend one seventh of your life. ]
  2. Depends on what you are doing by dsci · · Score: 5, Insightful

    We write code for hardcore chemical simulations. The limits on what can be studied, i.e. the number of atoms/molecules or the timescales of the simulations, depend on one thing: speed.

    Faster computers mean better simulations. BUT, if the code is not as fast as it can be on a particular architecture, your simulations are not going to be as complete as they can be, at least within a given time allotment.

    I've recently applied some code optimizations to a Monte Carlo simulation and saw speed ups of over 1000x. That's significant.

    It's naive to think that faster computers mean we should live with sloppy or unoptimized code. SIMD is a useful technique, and if it means the difference between getting work done in one week rather than two or three, I think I'll take the one-week sim.

    --
    Computational Chemistry products and services.
  3. Why limit yourself to Altivec when you have NVidia by kompiluj · · Score: 3, Insightful

    Well, the processing power of Altivec or MMX/SSE/3DNow or whatever is nowhere near the power of the newest NVidia/ATI card you have surely bought for playing Doom III. Why not use it then? Get the Brook compiler! Furthermore, I see they introduce classes like vec, etc. Such classes have already been designed successfully for C++. Why not try porting Blitz to the Altivec and/or to the GPU?
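    To illustrate what "classes like vec" buy you, here is a minimal sketch using the standard library's std::valarray as a stand-in. It is not Blitz, Brook, or macstl code, and the function name multiply_add is made up for this example. The point is only that whole-array expressions let the library, rather than the caller, decide how the element loop is carried out.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Hypothetical helper (not part of any library named in the thread):
// computes a[i] * b[i] + k for every element, written as one
// whole-array expression instead of an explicit loop.
std::valarray<float> multiply_add(const std::valarray<float>& a,
                                  const std::valarray<float>& b,
                                  float k) {
    assert(a.size() == b.size());  // element-wise ops need equal lengths
    std::valarray<float> result = a * b + k;
    return result;
}
```

    A caller writes multiply_add(a, b, 0.5f) and never spells out the loop; whether that loop ends up using SIMD instructions depends on the implementation and compiler.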

    --
    You can defy gravity... for a short time
  4. Re:Moore's Law has eroded the need for assembly by groomed · · Score: 3, Insightful

    Sorry, but yours is an utterly kneejerk boilerplate response which has nothing to do with the topic at hand and only serves to establish your credentials as a hard nosed realist who has been there and done it.

    Moore's Law has eroded the need for such knowledge

    Moore's "law" (which is just an off-the-cuff observation, really) has nothing to do with this. If anything, Moore's law has enabled transistor and space devouring SIMD technology.

    It would be like concerning myself on how to design circuits...

    No, it's nothing like that at all. Just because you own and know how to use money doesn't mean there is no point to the complex financial reckonings that are made every day at institutions all over the world. You may not need it, but "you" is not under discussion.

    Yes some people who write games are still concerned with assembly as are people in embedded markets. But those jobs, situations and skills are niche

    By this definition, everything is niche. The whole computing industry becomes "niche". Farming is "niche". The paper industry is "niche". What you're describing is just non-descript white collar administrative work which just happens to involve a computer; bit shuffling, rather than paper shuffling.

    Those situations are about the last place you will find anyone caring about something called "assembly language."

    Again, completely irrelevant.

    The point is that with a few dozen lines of SIMD code (whether in assembly or some high level language) any reasonably competent programmer can achieve four-fold, ten-fold, even twenty-fold speedups on critical path code, from scratch, in as little as a week.

    These are amazing results, and people should be encouraged to investigate the possibilities, not be dragged down into this drab netherworld of yours.
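    For a sense of how small such a kernel can be, here is a rough sketch of a "y += a * x" loop in scalar form and with SSE intrinsics. The function names saxpy_scalar and saxpy_simd are invented for this example, the SSE path is only compiled where the compiler defines __SSE__, and no particular speedup is claimed; on AltiVec the analogous fused operation would be vec_madd.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#if defined(__SSE__)
#include <xmmintrin.h>  // SSE intrinsics, x86 only
#endif

// Scalar reference: y[i] += a * x[i], one element per iteration.
void saxpy_scalar(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) y[i] += a * x[i];
}

// Same computation, four floats per iteration where SSE is available.
void saxpy_simd(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    std::size_t i = 0;
#if defined(__SSE__)
    const __m128 va = _mm_set1_ps(a);             // broadcast a to all 4 lanes
    for (; i + 4 <= n; i += 4) {
        const __m128 vx = _mm_loadu_ps(&x[i]);    // unaligned load of 4 floats
        __m128 vy = _mm_loadu_ps(&y[i]);
        vy = _mm_add_ps(vy, _mm_mul_ps(va, vx));  // y + a * x, lane by lane
        _mm_storeu_ps(&y[i], vy);
    }
#endif
    // Leftover elements (or all of them when SSE is unavailable).
    for (; i < n; ++i) y[i] += a * x[i];
}
```

    The scalar tail loop handles lengths that are not a multiple of four, so both functions give the same result for any input size.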

  5. Re:More AltiVec Goodness by bryanzak · · Score: 3, Insightful

    One of the problems of using libraries though is that the overhead of a function call usually negates any gain from vectorization. The lib call messes all kinds of things up, including instruction flow and caching, etc.
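    Two common ways around that overhead are sketched below, under the assumption that the library ships as headers: define the small operations inline so the compiler can fold them into the caller, and offer entry points that process a whole array per call so any remaining call cost is paid once rather than per element. The float4 type and the add and sum_all names are hypothetical, not taken from macstl or any other library.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 4-lane value type, standing in for a SIMD register.
struct float4 {
    std::array<float, 4> lane;
};

// Defined inline in the header, so the compiler is free to expand it
// at the call site instead of emitting a real function call.
inline float4 add(const float4& a, const float4& b) {
    float4 r{};
    for (std::size_t i = 0; i < 4; ++i) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

// Batch entry point: one call sums n vectors, so call overhead is
// amortized over the whole array rather than paid per element.
inline float4 sum_all(const float4* v, std::size_t n) {
    assert(v != nullptr || n == 0);
    float4 acc{};  // all lanes start at zero
    for (std::size_t i = 0; i < n; ++i) acc = add(acc, v[i]);
    return acc;
}
```

    Whether the compiler actually inlines add is its own decision; the header-only layout just makes that possible, which a call into a separately compiled library normally does not.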