Is Parallelism the New New Thing?
astwon sends us to a blog post by parallel computing pioneer Bill McColl speculating that, with the cooling of Web 2.0, parallelism may be a hot new area for entrepreneurs and investors. (Take with requisite salt grains as he is the founder of a Silicon Valley company in this area.) McColl suggests a few other upcoming "new things," such as SaaS as an appliance and massive memory systems. Worth a read.
Someone in another recent thread mentioned the TRIPS architecture. It's quite interesting reading.
A-Bomb
You see, the majority of the programmers out there don't know much about parallelism. They don't understand what synchronization, mutexes or semaphores are. And the thing is that these concepts are quite complex. They have a much steeper learning curve than hacking a "Web 2.0" application together with PHP, Javascript and maybe MySQL. So if everybody now starts writing multithreaded or otherwise parallel programs, that's going to result in an endless chain of race conditions, mysterious crashes and so on. Remember, race conditions have already killed people.
In common usage, threading usually implies different "streams" of execution doing independent things, at the same time. If the same function is executing in n different threads then you might call it "parallel" programming. A lot of multithreaded programming involves taking pieces of program functionality and breaking them out into separate threads, each executing independently.
Calling the latter architecture parallel computing is a misnomer. It is really "simultaneous" computing, i.e. things can happen at the same time. There is a big difference between the same function executing n times in parallel and different threads doing different things simultaneously.
For example, a "trivial" program which reads in a list of numbers from a file, computes something (say the sum of the magnitudes squared), and prints the result out to the screen might be implemented as follows:
while not eof:
    read n numbers
    compute something from them
    print result
A multithreaded version might look something like this:
Thread 1 (Disk IO): read n numbers from disk, write them to a queue/shared memory, repeat
Thread 2 (Outputting): wait for outputs to become available, print them, repeat
Thread 3 (Compute): wait for inputs to arrive in a queue, process them, write output to another queue, repeat
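To make the three-thread layout above concrete, here is a rough Python sketch. It is only an illustration under some assumptions of mine: a list of chunks stands in for the disk file, the names (reader, compute, printer, run_pipeline, SENTINEL) are made up for this example, and the "something" being computed is the sum of the magnitudes squared.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker, not part of the original post

def reader(chunks, inputs):
    # Stand-in for the Disk IO thread: pass each chunk of numbers downstream.
    for chunk in chunks:
        inputs.put(chunk)
    inputs.put(SENTINEL)

def compute(inputs, outputs):
    # Compute thread: wait for inputs, process them, queue the result.
    while True:
        chunk = inputs.get()
        if chunk is SENTINEL:
            outputs.put(SENTINEL)  # tell the printer there is nothing more
            break
        outputs.put(sum(abs(x) ** 2 for x in chunk))

def printer(outputs, results):
    # Output thread: wait for results, print them, keep a copy for the caller.
    while True:
        value = outputs.get()
        if value is SENTINEL:
            break
        print(value)
        results.append(value)

def run_pipeline(chunks):
    inputs, outputs, results = queue.Queue(), queue.Queue(), []
    threads = [
        threading.Thread(target=reader, args=(chunks, inputs)),
        threading.Thread(target=compute, args=(inputs, outputs)),
        threading.Thread(target=printer, args=(outputs, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling run_pipeline([[1, 2], [3, -4]]) prints one result per chunk. Note that each thread does a different job here, so this is the "simultaneous" style rather than the parallel one: the queues do all the synchronization, and there is still only one thread doing the actual arithmetic.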
A parallel version would just have more than 1 Compute thread, and they would subdivide the work between them (for example 2 threads dividing the input array into stripes, one handling even indices, the other odd...or a bunch of threads computing different slices of the array). Note that the threads would still have to combine their results at the end of the computation, and that is not always simple to do in parallel.
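A sketch of the striped version might look like the following in Python. Again, the function names and the choice of ThreadPoolExecutor are my own assumptions for illustration. Also be aware that in standard CPython the global interpreter lock keeps threads from running this kind of pure-Python arithmetic on several cores at once, so treat this as a picture of the structure (split, compute, combine) and not as a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(numbers, start, step):
    # One worker's stripe: indices start, start + step, start + 2*step, ...
    return sum(abs(x) ** 2 for x in numbers[start::step])

def parallel_sum_of_squares(numbers, workers=2):
    # With workers=2, one stripe gets the even indices and the other the odd ones.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(partial_sum, numbers, i, workers)
            for i in range(workers)
        ]
        # Combining step: here it is just an addition, done by the caller.
        return sum(f.result() for f in futures)
```

Every worker runs the same function on a different stripe of the data, which is the difference from the pipeline case. The combining step is trivial for a sum, but for other computations it can be the hard part.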
Some problems or algorithms simply cannot execute in parallel. Memory access patterns, caching and branch divergence (whether threads taking different code paths will hurt performance) also come into play. That is a whole new set of issues to worry about, but they are not too difficult. As the professor who teaches the course http://courses.ece.uiuc.edu/ece498/al1/ says, learning parallel programming is not hard; he could teach it to you in 2 hours. But doing parallel programming well and efficiently is difficult. You can write a trivial parallel program which uses just 1 processor and has just 1 thread, which is identical to the sequential version, and it will work logically, although the performance will be severely limited. You can then extend it to use n threads, and it will usually experience a speedup. But to take full advantage of the hardware on your board, you will need to know a few tricks, and understand the hardware and your program's behavior intimately.
Another issue is that programmers were spoiled by processor upgrades coming along and speeding up their programs "for free" by virtue of their higher clock speed. Now, with clock speeds running into physical limits, most of the evolution in new processors will be in the number of cores they have. So the main way to coax more out of a program will be to make it more parallel, and that might be trivial or it might be difficult. We're going to have to think laterally to get more performance out of software.
Hasan