Is Parallelism the New New Thing?
astwon sends us to a blog post by parallel computing pioneer Bill McColl speculating that, with the cooling of Web 2.0, parallelism may be a hot new area for entrepreneurs and investors. (Take with requisite salt grains as he is the founder of a Silicon Valley company in this area.) McColl suggests a few other upcoming "new things," such as SaaS as an appliance and massive memory systems. Worth a read.
When I was in graduate school in the mid-'90s I thought parallelism would be the next big thing. Needless to say, I was a bit early on that prediction. Maybe those graduate classes and grant work will finally pay off. :-)
Think Deeply.
Someone in another recent thread mentioned the TRIPS architecture. It's quite interesting reading.
A-Bomb
This seems far, far too low. Admittedly I work in a place that does "parallel programming," but it still seems awfully low.
This post climbed Mt. Washington.
Paul?
Paul Otellini?
I didn't know you posted on Slashdot!
So what's up, man? Can I buy you a beer?
the guy has a "startup in stealth mode" called parallel computing. Of course he wants to generate buzz.
Decade after decade, people keep trying to sell silver bullets for parallel computing: the perfect language, the perfect network, the perfect OS, etc. Nothing ever wins big. Instead, there is a diversity of solutions for a diversity of problems, and progress is slow but steady.
A guy who's made it his life's work to study Parallel Computing has come forth to say, he thinks Parallelism is the next big thing?
Shock! And Awe!
Having been in the computer industry for too long, I reckon the "next hot thing" usually means the "latest fad" that many of the entrepreneurs involved hope will turn into the "next get-rich-quick scheme".
Because really, does anybody believe Web-Two-Oh was anything but the regular web's natural evolution with a fancy name tacked on?
"A door is what a dog is perpetually on the wrong side of" - Ogden Nash
Now that we are seeing more and more multi-core CPUs and multi-CPU computers, I can definitely see parallelism becoming more important for tasks that can be handled this way. You have to remember that in certain cases trying to parallelise a task can end up being less efficient, so what you parallelise will depend on the task in hand. Things like games, media applications and scientific applications are usually likely candidates, since they are either doing lots of different things at once or have tasks that can be split up into smaller units that don't depend on each other's outcome. Server applications can benefit to a certain extent, depending on whether their tasks are trying to access the same resources or not (an FTP server accessing the disk vs. a time server, which does no file I/O).
One thing that should also be noted is that in certain cases you will need to accept increased memory usage, since you want to avoid tasks locking on resources that they don't really need to synchronise until the end of the work unit. In this case it may be cheaper to duplicate resources, do the work and then resynchronise at the end. Like everything, it depends on the size and duration of the work unit.
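A minimal Python sketch of that "duplicate, work, then resynchronise" idea, assuming a made-up word-counting job (the function name `count_words` and the chunked input are illustrative, not from any real project). Each worker tallies into its own private counter, so no lock is needed until the single merge at the end. Note that in CPython, threads will not speed up CPU-bound work like this; the sketch only shows the structure.

```python
import threading
from collections import Counter


def count_words(chunks, num_workers=4):
    """Count words with one private Counter per worker, merged once at the end."""
    # Duplicated resource: one Counter per worker costs memory but avoids locking.
    partials = [Counter() for _ in range(num_workers)]

    def worker(idx):
        local = partials[idx]
        # Worker idx takes chunks idx, idx + num_workers, idx + 2 * num_workers, ...
        for chunk in chunks[idx::num_workers]:
            local.update(chunk.split())  # nobody else touches `local`

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Resynchronise once, after every work unit has finished.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

The trade-off is the one described above: memory grows with the number of workers, and the merge step only pays off if each work unit is large enough.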
Even if your application is not doing enough to warrant running its tasks in parallel, the operating system could benefit, so that applications don't suffer from sharing resources that don't need to be shared.
Jumpstart the tartan drive.
Oh yes, here it is.
And the conclusion?
It's been around for years numbnuts, in commercial and server applications, middle tiers, databases and a million and one other things worked on by serious software developers (i.e. not web programming dweebs).
Parallelism has been around for ages and has been used commercially for a couple of decades. Get over it.
Not parallelism... Why do MBA idiots have to fill everything with their crap? Now they'll start creating buzzwords, reading stupid web logs (called "blogs"), filling magazines with acronyms...
Coming soon: professional object-oriented XML-based AJAX-powered scalable five-nines high-availability multi-tier enterprise turnkey business solutions that convert visitors into customers, optimize cash flows, discover business logic and opportunities, and create synergy between their stupidity and their bank accounts - parallelized.
I was about to say 13256278887989457651018865901401704640, but it appears this number is private property.
So all of a sudden people have discovered parallelism? Gee, one of the really interesting things about Ada in the late '80s was its use on multiprocessor systems such as those produced by Sequent and Encore. There was a lot of work on the language itself (that went into Ada95) and on compiler technologies to support 'safe parallelism'. "Safe" here means 'correct implementation' against the language standard, considering things like cache consistency as parts of programs get implemented on different CPUs, each with its own cache.
Here are a couple of lessons learned from that Ada experience:
1. Sometimes you want synchronization, and sometimes you want avoidance. Ada83 Tasking/Rendezvous provided synchronization, but was hard to use for avoidance. Ada95 added protected objects to handle avoidance.
2. In Ada83, aliasing by default was forbidden, which made it a lot easier for the compiler to reason about things like cache consistency. Ada95 added more pragmas, etc, to provide additional control on aliasing and atomic operations.
3. A lot of the early experience with concurrency and parallelism in Ada showed (usually the hard way) that there's a 'sweet spot' in the number of concurrent actions. Too many, and the machine bogs down in scheduling and synchronization. Too few, and you don't keep all of the processors busy. One of the interesting things that Karl Nyberg worked on in his Sun T1000 contest review was the tuning necessary to keep as many cores as possible running. (http://www.grebyn.com/t1000/ ) (Disclosure: I don't work for Grebyn, but I do have an account on grebyn.com as a legacy of the old days when they were in the ISP business in the '80s, and Karl is an old friend of very long standing....)
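To illustrate lesson 3 outside of Ada, here is a rough Python sketch that caps the number of concurrent actions with a fixed-size pool rather than starting one thread per item. The helper name `run_bounded` is made up for this example, and defaulting to the CPU count is only an assumed starting point; the real sweet spot has to be found by measuring.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_bounded(func, items, max_workers=None):
    """Apply func to every item using a fixed-size pool of workers."""
    if max_workers is None:
        # Assumed starting point, not a tuned value: one worker per CPU.
        max_workers = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(func, items))
```

Tuning then comes down to trying a few values of `max_workers` on the real workload and timing them.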
All this reminds me of a story from Tracy Kidder's Soul of a New Machine http://en.wikipedia.org/wiki/The_Soul_of_a_New_Machine. There was an article in the trade press pointing to an IBM minicomputer, with the title "IBM legitimizes minicomputers". Data General proposed (or ran, I forget which) an ad that built on that article, saying "The bastards say, 'welcome' ".
dave
Sorry guys, web 2.0 was never cool and never will be.
mcgrew's razor: Never attribute to stupidity that which can be explained by greedy self-interest
Now that multi-core computers have been out I keep hearing buzz around the idea of parallel computing, as if it is something new. We've had threads, processes, multi-CPU machines, grid computing, etc etc for a long time now. Parallelism has been in use on single processor machines for a long time. Multi-core machines might make it more attractive to thread certain applications that were traditionally single-threaded, but that's the only major development I can see. The biggest problem in parallel computing is the complexity it adds, so hopefully developments will be made in that area, but it's an area that's been researched for a long time now.
You see, the majority of the programmers out there don't know much about parallelism. They don't understand what synchronization, mutexes or semaphores are. And the thing is that these concepts are quite complex. They require a much steeper learning curve than hacking a "Web 2.0" application together with PHP, JavaScript and maybe MySQL. So if everybody now starts writing multithreaded or otherwise parallel programs, that's going to result in an endless chain of race conditions, mysterious crashes and so on. Remember, race conditions have already killed people.
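For anyone who hasn't met a mutex yet, here is a small Python sketch of the classic case: several threads incrementing a shared counter. The names `SafeCounter` and `hammer` are invented for this example. Without the lock, the read-modify-write in `increment` can interleave between threads and lose updates; with it, only one thread at a time is inside the critical section.

```python
import threading


class SafeCounter:
    """A counter whose updates are serialised by a mutex."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Read, add and write back happen as one unit under the lock.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def hammer(counter, num_threads=8, per_thread=10_000):
    """Increment the counter from several threads and return the final value."""
    def work():
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

The nasty part of race conditions is that the unlocked version often appears to work in testing and only fails occasionally under load.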
In common usage, threading usually implies different "streams" of execution doing independent things, at the same time. If the same function is executing in n different threads then you might call it "parallel" programming. A lot of multithreaded programming involves taking pieces of program functionality and breaking them out into separate threads, each executing independently.
Calling the latter architecture parallel computing is a misnomer; it is really "simultaneous" computing, i.e. things can happen at the same time, but there is a big difference between the same thread executing n times in parallel and different threads doing different things simultaneously.
For example, a "Trivial" program which reads in a list of numbers from a file, computes something (say the sum of the magnitudes squared), and prints the result out to the screen might be implemented as follows:
while not eof
    read n numbers
    compute something from them
    print result
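As a runnable take on that loop, here is a short Python sketch. It assumes one number per line in the input, and the generator name `sum_of_squares_batches` is made up for illustration.

```python
def sum_of_squares_batches(lines, n):
    """Yield the sum of squared magnitudes for each batch of up to n numbers."""
    batch = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        batch.append(float(line))
        if len(batch) == n:
            yield sum(abs(x) ** 2 for x in batch)
            batch = []
    if batch:
        # A final, shorter batch if the input length is not a multiple of n.
        yield sum(abs(x) ** 2 for x in batch)
```

With a file, it would be used by opening the file, passing the file object as `lines`, and printing each yielded result.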
A multithreaded version might look something like this:
Thread 1 (Disk IO): read n numbers from disk, write them to a queue/shared memory, repeat
Thread 2 (Outputting): wait for outputs to become available, print them, repeat
Thread 3 (Compute): wait for inputs to arrive in a queue, process them, write output to another queue, repeat
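A rough Python sketch of those three threads, using the standard `queue.Queue` for the hand-offs. To keep it self-contained, the "disk" is just an in-memory list, and the names `run_pipeline` and `_DONE` are invented for this example; the sentinel object is one assumed way to tell the downstream threads that the input has ended.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def run_pipeline(numbers, n):
    """Read, compute and print in three separate threads joined by queues."""
    inputs = queue.Queue()
    outputs = queue.Queue()
    printed = []  # kept so the caller can inspect what was printed

    def reader():  # Thread 1: stands in for the disk IO thread
        for i in range(0, len(numbers), n):
            inputs.put(numbers[i:i + n])
        inputs.put(_DONE)

    def printer():  # Thread 2: outputting
        while True:
            result = outputs.get()
            if result is _DONE:
                return
            print(result)
            printed.append(result)

    def compute():  # Thread 3: sum of squared magnitudes per batch
        while True:
            batch = inputs.get()
            if batch is _DONE:
                outputs.put(_DONE)  # pass the end marker downstream
                return
            outputs.put(sum(abs(x) ** 2 for x in batch))

    threads = [threading.Thread(target=f) for f in (reader, printer, compute)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return printed
```

With a single compute thread and FIFO queues, results come out in input order; that guarantee goes away once there are several compute threads.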
A parallel version would just have more than 1 Compute thread, and they would subdivide the work between them (for example 2 threads dividing the input array into stripes, one handling even indices, the other odd...or a bunch of threads computing different slices of the array). Note that the threads would still have to combine their results at the end of the computation, and that is not always simple to do in parallel.
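The striped split and final combine might be sketched in Python as follows. The function names are made up, and a thread pool is used only to keep the example simple: in CPython, a CPU-bound job like this would need processes (or a different runtime) to actually run on several cores.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(values):
    """Sum of squared magnitudes for one stripe of the input."""
    return sum(abs(x) ** 2 for x in values)


def parallel_sum_of_squares(numbers, num_workers=2):
    """Split numbers into interleaved stripes, sum each stripe, then combine."""
    # Stripe i holds indices i, i + num_workers, i + 2 * num_workers, ...
    # With two workers that is even indices for one and odd for the other.
    stripes = [numbers[i::num_workers] for i in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(partial_sum, stripes))
    # The combine step: trivial for a sum, not always so for other problems.
    return sum(partials)
```

Here the combine is a single addition per worker, which is the easy case; the point about combining not always being simple applies when the partial results are not so easily merged.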
Some problems or algorithms simply cannot execute in parallel. Issues of memory access patterns, caching, and branch divergence (whether performance suffers when the threads take different code paths) also come into play. Parallelism brings a whole new set of issues to worry about, but they are not too difficult. As the professor who teaches the course http://courses.ece.uiuc.edu/ece498/al1/ says, learning parallel programming is not hard; he could teach it to you in 2 hours. But doing parallel programming well and efficiently is difficult. You can write a trivial parallel program which uses just 1 processor and has just 1 thread, which is identical to the sequential version, and it will work logically, although the performance will be severely limited. You can then extend it to use n threads, and it will experience a speedup. But to take full advantage of the hardware on your board, you will need to know a few tricks, and understand the hardware and your program's behavior intimately.
Another issue is that programmers were spoiled by processor upgrades coming along and speeding up their programs "for free" by virtue of their higher clock speed. Now with clock speeds reaching physical limits, the only evolution in new processors will be in the number of cores they have. So the only way to coax more out of a program will be to make it more parallel, and that might be trivial or it might be difficult. We're going to have to think laterally to get more performance out of software.
Hasan