Slashdot Mirror


Wintel, Universities Team On Parallel Programming

kamlapati writes in with a followup from the news last month that Microsoft and Intel are funding a laboratory for research into parallel computing at UC Berkeley. The new development is the imminent delivery of the FPGA-based Berkeley Emulation Engine version 3 (BEE3) that will allow researchers to emulate systems with up to 1,000 cores in order to explore approaches to parallel programming. A Microsoft researcher called BEE3 "a Swiss Army knife of computer research tools."

6 of 91 comments (clear)

  1. "stuck with a ...serial programming model" by BadAnalogyGuy · · Score: 5, Insightful

    It's a little disingenuous to claim that programmers are "stuck" with a serial programming model. The fact of the matter is that multi-threaded programming is a common paradigm which takes advantage of multiple cores just fine. Additionally, many algorithms cannot be parallelized.

    Even languages like Erlang which bring parallelization right to the front of the language are still stuck running serial operations serially. There is sometimes no way around doing something sequentially.

    Now, can we blow a few cycles on a few cores trying to predict which operations will get executed next? Yeah, sure, but that's not a programming problem, it's a hardware design problem.

    1. Re:"stuck with a ...serial programming model" by gdgib · · Score: 4, Insightful

      ParLab is so not about branch predictors and out-of-order execution. As you say, that's a hardware design problem and a solved one at that. Boring.

      While I'll agree that not all programmers are stuck with the serial programming model, threads aren't exactly a great solution (http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.html). They're heavyweight and inefficient compared to running most algorithms on e.g. bare hardware or even an FPGA. Plus they deal badly with embarrasing parallelism (http://en.wikipedia.org/wiki/Embarrassingly_parallel). And finally they are HARD to use, the programmer must explicitly manage the parallelism by creating, synchronizing and destroying threads.

      Setting aside those problems which exhibit no parallelism (for whom there is no solution but a faster CPU really), there are many classes of problems which would benefit enormously from better programming models, which are more efficiently tied to the operating system and hardware rather than going through an OS level threading package.

    2. Re:"stuck with a ...serial programming model" by robizzle · · Score: 4, Insightful

      You are right, programmers aren't currently "stuck" with a serial programming model; however, looking into the future it is pretty clear that the hardware is developing faster than the programming models. Systems with dozens and dozens of cores aren't far off but we really don't have a good way to take advantage of all the cores.

      In the very near future, we could potentially have systems with hundreds of cores that sit idle all the time because none of the software takes advantage of much more than 5-10 cores. Of course, this would never actually happen, because once the hardware manufacturers notice this to be a problem, they will stop increasing the number of cores and try to make some other changes that would result in increased performance to the end user. There will always be a bottleneck -- either the software paradigms or the hardware and right now it looks like in the near future it will be the software.

      Yes, there are some algorithms that no matter what you do have to be executed sequentially. However, there is a huge truckload of algorithms that can be rewritten, with little added complexity, to take advantage of parallel computing. Furthermore, there is a slew of algorithms that could be rewritten with a slight loss in efficiency to be parallelized but with a net gain in performance. This third type of algorithm is what I think the most interesting is for researchers -- Even though parallelizing the algorithm may introduce redundant calculations or added work, the increased number of workers outweighs this.

      In other words, what is more efficient: 1 core that performs 20,000 instructions in 1 second or 5 cores that each perform 7,000 instructions, in parallel, in 0.35 seconds. Perhaps surprisingly to you, the single core is more efficient (20,000 instructions instead of 7,000*5 = 35,000 instructions) -- BUT, if we have the extra 4 cores sitting around doing nothing anyways, we may as well introduce inefficiency and finish the algorithm about 2.9 times faster.

  2. Re:1000 cores? by SeekerDarksteel · · Score: 3, Insightful

    1) Quantum computing != parallel computing.

    2) A significant number of applications can and do run on 1000+ cores. Sure, most are scientific apps rather than consumer apps, but there is a market for it nevertheless. Go tell a high performance computing guy that there's no need for 1k cores on a single chip and watch him collapse laughing at you.

    --
    The laws of probability forbid it!
  3. Re:kinda silly by gdgib · · Score: 3, Insightful

    Interestingly enough, Dave Patterson http://www.eecs.berkeley.edu/Faculty/Homepages/patterson.html, once president of ACM http://membernet.acm.org/public/membernet/storypage_2.cfm?ci=June_2006&announcement=1&CFID=1668767&CFTOKEN=37941036 was once on a project to do that http://iram.cs.berkeley.edu/. Now he's working on ParLab http://parlab.eecs.berkeley.edu/. I don't always agree with him (and vice versa) but he's nobody's fool.

    Faith, young grasshopper...

    If you want a more technical reason DRAM and CPU's don't go together, spend an informative hour looking up the IC fab process for CMOS logic (CPUs) and DRAM. They're VERY VERY different. DRAM needs capacitory density to get the price-per-bit down so they use their own custom fabs optimized for that. This makes it really hard to fit lots of logic and DRAM on to one chip.

  4. Parallel Computing is not magic by technienerd · · Score: 4, Insightful

    I'm about to start a graduate degree in this area so I'm a little biased. However, I think a lot of problems can be solved in parallel. For example, maybe, LZW compression as it's implemented in the "zip" format might not be easily parallelizable but that doesn't prevent us from developing a compression algorithm with parallelism in mind. I did some undergraduate research in parallel search algorithms and I know for a fact that there are many, many ways you can parallelize search. Frankly, saying that you can't parallelize algorithms is a bit closed minded. Many problems don't inherently require serial solutions, it's just current algorithms handle them that way. Rather than trying to implement existing algorithms on a massively parallel processor, you want to re-tackle the problem under a new model, a model of an arbitrary number of processors. You build around the idea of data-parallelism rather than task-parallelism. Many, many things are possible under this model and I think it's naive to think otherwise. You don't need to think, how do I juggle 1000 threads around, you think, how do I take a problem, break it up into arbitrarily many chunks and distribute those chunks to an arbitrary number of processors and how do I do all that scheduling efficiently? This model doesn't work for interactive tasks mind you (where you're waiting for user input), but I'm very confident a model can be developed that can.