An Overview of Parallelism
Mortimer.CA writes with a recently released report from Berkeley entitled "The Landscape of Parallel Computing Research: A View from Berkeley: "Generally they conclude that the 'evolutionary approach to parallel hardware and software may work from 2- or 8-processor systems, but is likely to face diminishing returns as 16 and 32 processor systems are realized, just as returns fell with greater instruction-level parallelism.' This assumes things stay 'evolutionary' and that programming stays more or less how it has done in previous years (though languages like Erlang can probably help to change this)." Read on for Mortimer.CA's summary from the paper of some "conventional wisdoms" and their replacements.
Old and new conventional wisdoms:
Old and new conventional wisdoms:
- Old CW: Power is free, but transistors are expensive.
- New CW is the "Power wall": Power is expensive, but transistors are "free." That is, we can put more transistors on a chip than we have the power to turn on.
- Old CW: Monolithic uniprocessors in silicon are reliable internally, with errors occurring only at the pins.
- New CW: As chips drop below 65-nm feature sizes, they will have high soft and hard error rates.
- Old CW: Multiply is slow, but load and store is fast.
- New CW is the "Memory wall" [Wulf and McKee 1995]: Load and store is slow, but multiply is fast.
- Old CW: Don't bother parallelizing your application, as you can just wait a little while and run it on a much faster sequential computer.
- New CW: It will be a very long wait for a faster sequential computer (see above).
pretty much the same thing Dave Patterson's been saying for a while now...in fact, the CW sounded so familiar, I went back to double check his lecture slides from more than a year ago:
/ Cs252s06-lec01-intro.pdf
http://vlsi.cs.berkeley.edu/cs252-s06/images/1/1b
and it's pretty much identical (check out slide 3 on the first page of the pdf)
I think parallelism can be achieved elegantly using languages that express what is to be done, rather than how it is to be done. Functional programming is a major step in the right direction. Not only do functional programs typically more clearly express what is to be done (as opposed to which steps are to be taken to get there), they also tend to cause fewer side effects (which restrict the correct evaluation orders). In particular, not using variables avoids many of the headaches traditionally involved in multithreading.
Please correct me if I got my facts wrong.
I don't think you were listening very carefully to the talk (or know much about Computer Architecture) if you think Dave Patterson is a supercomputer guy. Perhaps you've heard of the Hennessy & Patterson Quantitative Approach to Computer Architecture book (you know, the one used at basically every university to teach about computer architecture). Patterson has been involved in a lot of different things within computer architecture over the years, including being one of the main people behind RISC and RAID (as well as being the president of the ACM). I saw his talk when it was given at Berkeley, and you really missed the point if you thought it was about supercomputing. The talk was about the future of computing in general, which is increasingly parallel, in case you're unaware of that fact. GPUs are already at 128 cores, Network processors are up to 200 cores. Intel is going to present an 80 core x86 test chip tomorrow at ISSCC. Physics won't support faster single core processors at the rate we're accustomed to, so the whole industry is going parallel, which is a sea change in the industry. Patterson's talk is aimed at the research community, since we don't have good answers as to how these very parallel systems should be architected and programmed. FPGA emulation is a great way to play around with massive multiprocessor configurations and programming strategies, which is why Patterson is advocating it (his RAMP project has researchers from MIT, Berkeley, Stanford, Texas, Washington involved (among others)). You also need to have a little more imagination about what we could do with more computing power. Try looking at Intel's presentations on RMS http://www.intel.com/technology/itj/2005/volume09i ssue02/foreword.htm.
I think using a *specific* language for automatic parallelization is the wrong way. Some GNU folks are working on language independent autoparallelization for GCC 4.3. Their implementation is an extension to the OpenMP implementation in GCC. Read OpenMP and automatic parallelization in GCC, D. Novillo, GCC Developers' Summit, Ottawa, Canada, June 2006 http://people.redhat.com/dnovillo/Papers/ for details.