Choice Overload In Parallel Programming
scott3778 writes to recommend a post by Timothy Mattson over at Intel's Research Blog. He argues, convincingly, that the most important paper for programming language designers to read today is one written by two social psychology professors in 2000. This is the well-known academic study, "When Choice is Demotivating: Can One Desire too Much of a Good Thing?" "And then we show them the parallel programming environments they can work with: MPI, OpenMP, Ct, HPF, TBB, Erlang, Shmemm, Portals, ZPL, BSP, CHARM++, Cilk, Co-array Fortran, PVM, Pthreads, windows threads, Tstreams, GA, Java, UPC, Titanium, Parlog, NESL, Split-C... and the list goes on and on. If we aren't careful, the result could very well be a 'choice overload' experience with software vendors running away in frustration."
Because I'm sick (in the head) I say we go with the Fortran option!
'twas my second language; after BASIC. Ahhh, the fond memories...
What's wrong with OpenMP?
"Politicians and diapers must be changed often, and for the same reason."
http://www.columbia.edu/~ss957/whenchoice.html
Those who would give up essential liberty to purchase a little temporary safety, deserve neither liberty nor safety.
Microsoft will come along and tell you what your choice will be.
I don't remember reading anywhere that you have to pick all of them. Just go through the list like "ignore, ignore, ignore, aha I know that one" and pick one that you like or are good at and use that one. Then you ignore the rest and I just don't see where the overload happens. Maybe, possibly if a company is dumb enough to let every programmer just do whatever and nobody can read and understand each other's code and nothing works together cuz they all chose something different then yeah, overload time but that's just idiotic.
Google's Super Secret Search Algorithm: SELECT @search_results FROM internet WHERE @search_results = 'good'
Write concurrently in two languages, then you're sure to make full use of available CPU cores.
This whole idea of 'choice overload' is so much drivel, IMHO. And, no, I'm not trying to flame here.
Have you ever known anybody to say: "There are just too many girls to choose from, I guess I'll go hide in the basement."?
Or: "There are ten thousand restaurants in this city. I just can't cope. I'm going to stop eating."?
A better label for the whole subject would be: " How a small minority of people fail to learn tree-pruning techniques, and dissolve in panic." Then we all could say: "Yep, sounds like my ex-girlfriend. Been there, done that. Next?"
I, for one Welcome our new Choice Overloads....
Quoth the blogger: "With hundreds of languages and API's out there, is anyone really dumb enough to think "yet another one" will fix our parallel programming problems?"
Yet Intel touts its Threading Building Blocks library as just such a fix to many parallel programming problems. Now, TBB is a very nice product, and in many ways it is superior to a lot of existing libraries, APIs, and languages, but one gets the sense that maybe the left hand doesn't know what the right hand is doing at Intel.
I might also draw an analogy to the open source world, where there are often dozens of solutions to both simple/mundane problems (text editors, media players, command line shells, etc) and more complex ones (window managers, Linux distributions, etc). I wonder if the free and open source software world wouldn't also benefit from a "culling of the herd," so to speak.
Just write each thread in a different language!
If we aren't careful, the result could very well be a 'choice overload' experience with software vendors running away in frustration.
It's true! Just look at how the large number of available programming languages is driving people away from software development. If we could only limit it to just a few options then people might not be so afraid to start programming.
Or we could be serious for a minute and say that a useful marketing tactic is perhaps not "the most important paper for programming language designers to read today".
While many of these technologies overlap, I think that each serves a niche that is important. I think it would be very difficult to design a language that is easy to use which combines the whole superset of what all of them cover. IBM is attempting to create a language that covers a lot of it with its experimental X10 language (http://domino.research.ibm.com/comm/research_projects.nsf/pages/x10.index.html). While it works, it can be very painful to program in. In the case of parallel and concurrent programming, I think choice is good.
"We (that is, computer companies) want to sell hardware. To do that, we need software."
of course... efficiency doesn't matter, it's even sometimes detested by those selling bigger CPUs.
Parallel programming frameworks are not an assortment of jams with a coupon. In addition to a qualitative (like / dislike) evaluation of the code style and API there are quantitative measurements of how these frameworks handle different tasks and the results that they produce when applied to various problems.
Now... I'm no big python user, but Stackless python is at least interesting and I would do myself a disservice by not evaluating it. I need to know why it is or isn't quantitatively better.
The only problem that exists with too many choices in computing is supporting and not forcing the deprecation of legacy APIs (something like Java Collections Vector vs ArrayList, extensive use of String vs CharSequence, etc).
and I say that as an atheist.
Ok, first: he writes as if all choices are equivalent. One jam might as well be the same as another, they just differ by taste. It's not like I walk into the store already invested in raspberries. It's not as if Java programmers are going to decide that the Fortran parallel library is better, so why not just switch to Fortran.
Second, I doubt explicit parallel programming is going to be mainstream anytime soon. No, make that ever. Ever! Parallel programming will only happen in the mainstream when it is handled implicitly by the language, like a dataflow language. Asking normal programmers to deal with parallel programming is trouble when basic logic eludes most of them.
Third, all you people, including the author of TFA, who think that more than one or two standards is a bad thing ("the great thing about standards is there are so many to choose from!") it's time to wake up: the world is not about to consolidate. The future is going to require C3PO and R2D2: there will be so many fricking languages and standards that your translator is going to require AI and legs to come along with you. For every one thing that fades away, eventually, probably 10 or 100 replace it. The future is a big mess.
must... stay... awake...
Choice is good if it provides different tools for different tasks. The list provided is somewhat silly, since several of the technologies address completely different issues and applications. There's a reason Sears sell thirty different shapes of hammers -- all nails are not the same.
After considerable deliberation and experimentation, I've chosen OpenMP for most task-parallel applications. The syntax is simple, it operates across C, C++, and Fortran, and it is supported by most major compilers on Linux, Windows, and Sun. The only quirk has been problematic support in GCC 4.2, but that will likely be cleared up within a few months. For cluster work, I tend to use MPI, because it has a long history and good support. I'm sure other tools have good versatility in environments different from those I frequent.
All about me
On top of that, if this really is something that affects programmers then why the hell aren't we all rendered utterly useless by the number of programming languages? Or all the possible ways one could format code? Etc.
But hey, the guy's writing in a "research" blog and, as in academia, when you don't have anything real to contribute you can cite something completely unrelated and pretend it has relevance.
Honestly, this sounds vaguely like "there's too much to choose from, so everyone just use Intel Thread Building Blocks, K? You can't possibly do better so just use our stuff because we cover all cases..."
[And next time I won't hit submit and then reply to myself when I meant to just hit preview...I hope]
Too many choices between languages, where in many of them the pros and cons balance and cancel for your particular application, can be deadly.
I read a lot of whining in the comments for this article. Let me put this in a different perspective: this is more like a frontier in programming that we're on here. In the past, a single core processor running serial code was fine on any desktop. Any applications needing parallel programming were run on high-end servers or Beowulf Clusters. They weren't for the average computer user out there.
Now we have all these new-fangled dual/tri/quad core processors in the average microcomputer. It would be foolish to let all that computing power go to waste by running code in series. It just became economical for just about every application to be written in parallel.
And by-the-way... GET OFF MY LAWN!
The game.
"All you have to do is be fragile and grateful. So stay the underdog." Chuck Palahniuk, Choke
Speaking of being able to "prune" your tree of possible development choices, Java should be the first to go.
I've been gradually trying to learn more about functional programming, partly because I think fp techniques and ways of thinking come in handy even if you're programming in a procedurally oriented language, and partly because fp seems like a paradigm that is likely to get more and more useful as we get machines with more and more cores. Okay, fp!=parallel, but, e.g., one of the big selling points of Erlang is supposed to be that it lends itself to completely transparent use of parallel processors.
The choice overload does seem like kind of an issue to me. For as long as I continue to keep programming comfortably in the procedural languages I'm comfy with (e.g., perl), I'm never going to really wrap my mind around the radically different ways of thinking that you get in a more fp world. I've been thinking for a long time that it would be fun to do a coding project in ocaml ... or haskell ... or lisp ... or erlang ... or -- you get the idea.
The trouble is, it's really not clear what to hitch my wagon to. Ocaml seems to have a very high quality implementation, but its garbage collector isn't multithreaded, the only book you can buy is in French (it's nice that you can download the English version for free, but I'd prefer to buy something bound), and the availability of libraries (and documentation for them) isn't quite as wonderful as I've gotten used to with perl. Lisp could be cool, but I hate the fact that it's not standardized, and I'm not convinced that eschewing arbitrary syntax really carries more pros than cons. Haskell? Maybe, but it sounds like putting on a hair shirt. The list goes on. I really feel like a deer in the headlights.
Find free books.
There are hundreds of languages that support loops, variable assignments, recursion, definition of subroutines and Joe knows what else.
Language constructs to support mp are bound to be just as numerous. I'm not normally one to be so dismissive of a post, but I think this is one of the more pointless items ever shared with this erudite little community.
That's just silly. There are two types of programmers that could be making choices like this, and neither one of them would suffer from too much choice.
The first kind is a programmer just trying to parallelize existing code. In that case, the choice of threading platforms is pretty much obvious. Existing Windows code? Use windows threads. C/C++ on Unix? Pthreads probably. Java code? Java threads... Probably not even 2 seconds worth of thought will go into considering the alternatives (and that's probably fine).
The other type of programmer is one who's actively looking to develop high performance parallelized software. I am talking about cases where performance is the primary objective and it drives the choice of programming language and platform. In these cases, the nuance of the different thread models might matter but the programmer of this type would be happy (rather than scared) to investigate all the options. After all, if he didn't care, he'd just go with the default choices like the first programmer.
http://ed.markovich.googlepages.com
A nice video about The Paradox of Choice is available at Google Video. It is an interesting topic, but I don't think it applies all that much to parallel programming. The issue isn't that there are too many languages, but simply that there are a bunch of very well established languages that provide you little to no help with writing parallel programs properly, so everybody just continues to write their programs the way they did the last 20 years and thus takes little or no advantage of the available multiprocessor systems. And I doubt that just reducing the choice would help much at all with that right now, since we really still don't know how to write parallel programs on a large scale (i.e. in a way that everybody can and does do it), so some more research and experimentation is needed.
Java's advantage is the massive library and the GUI toolkits. Of course, GUI shouldn't be relevant to things like massively parallel server programs...
Care about privacy? Read this!
Just use RubyMPI when I release it next week :)
The power of MPI wrapped in the beauty of Ruby.
http://www.public.iastate.edu/~crb002/
bash-2.04$
bash-2.04$yes "Don't you hate dialup connections?"| write USERNAME
In Soviet Russia the new collective processing system chooses you!
Okay, this list seems to be of several different technologies, some of which overlap, but several are used for very different tasks. You cannot replace MPI with Pthreads.
I don't see the problem. Just as we have many different programming languages, these different interfaces all have different niches.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
It's kind of amusing looking at the languages he lists. MPI and OpenMP are by far the most-used environments, but pthreads and java should probably be next, not at the end of the list. Ct, Intel's new parallel language, hasn't even been formally announced yet, let alone there being any released documentation / code for it. CUDA however, Nvidia's competing parallel language, isn't even mentioned though it's been released for months now.
With hundreds of drugs out there, is anyone really dumb enough to think "yet another one" will fix their illness? With hundreds of energy sources out there, is anyone really dumb enough to think "yet another one" will solve our problems? With hundreds of CPU/computer architectures, is anyone really dumb enough to think "yet another one" will solve our problems? Sometimes the answer is yes, because it's not "just another" one. Sometimes the first hundred alternatives all suck, but the hundred-and-first is the one that puts all the others to shame. Look up the story of Edison trying different materials for the first light bulb. Choice overload is a real problem, and sometimes #101 sucks just as much as 1-100, but that's no excuse for stifling innovation. Parallel programming languages and frameworks are not like jam. They're not matters of personal taste alone. There are empirical measures by which one can determine whether one approach works better than another, either in one environment or across a variety. Also, again unlike jam, parallel programming is still a poorly understood problem domain. The potential for major conceptual breakthroughs is still there, and by their very nature we cannot predict when or where such breakthroughs will occur. Therefore a lot of different people have to try a lot of different things. That's how innovation works, and how progress is made. Standardizing too early is at least as bad as standardizing too late. Competition works. It might be messy sometimes, but it still beats the alternatives.
Slashdot - News for Herds. Stuff that Splatters.
What does a web browser need more than one core for? Or a word processor? Or an IM client? The only "desktop PC" type tasks I can think of that might actually be able to saturate even a single CPU are multimedia and gaming. In the case of the former, it's usually enough for the OS to put the media player's threads on one core and everything else on another so they don't have to fight. In the case of the latter, well, video games hardly qualify for "just about every application".
Part of the problem is that there isn't a good solution yet, so there's a lot of effort being put into trying to find a way for a bad solution to be more comfortable.
Old-school iterative languages are a clumsy fit. They're nigh impossible to debug, and ones that let you do clever things at the hardware level will bring the whole project down in screaming flames when someone tries to get clever. So new libraries for old languages seldom fill the bill.
New-hotness functional languages are insane. It's very, very, very difficult for seasoned programmers to get their heads around it, and impossible for n00bz who don't have heavy math backgrounds. Compounding the issue is that the syntax tends to be on the wrong side of horrible with little or no syntactic sugar to make the medicine go down. So re-imagining the paradigm is a bit like picturing a five dimensional sphere - great fun, if you're smart enough to do it. No-one is smart enough to do it.
We're probably looking at a problem space that is best tackled by something that doesn't exist yet - an elegant, easily understood tool that simply makes sense, like objects or everything-is-a-file or scripting languages or regex. We're seeing so many different approaches to MPP because programmers are trying to figure out what that tool is. Once someone hits on it, the field will shake itself out.
Since we haven't hit on it, too much choice is a good thing - it means people will take the initiative to do something on their own that works better, rather than trying to get something suboptimal to work because it's the "standard".
Are there any electricians or mechanics here who have the problem of too much choice when they go over to Sears, or wherever it is that you shop?
"Thanks for all the money you paid to us. We've used it to buy off ISO among other things" -Microsoft
Ok, there's a lot of different concurrency packages...but there's a lot of different programming languages, too. I could think of at least two dozen, and you probably could too. That make it hard for you to sit down and write code? No? Just like most code gets written in C or Java or any of a small handful of scripting languages, there's a few standout parallel programming packages (pthreads, MPI, Java's built-in). Cilk? Who uses Cilk? If you want to write threaded code in C, you probably use pthreads, because that's what everyone else uses and has experience with. Not much of a choice overload. Unless you're researching concurrency packages, you don't need to know about most of the stuff on that list.
And I say that as an agnostic.
This guy must be missing the point of having different programming languages and environments - parallel or not. He lists ZPL, which is, first and foremost in my opinion, a really cool array-based language. There are certain things you're going to want to do in ZPL as opposed to non-array based languages, such as image processing (which lends itself really well to parallel processing IMHO). For things that don't require multi-dimensional array processing, you wouldn't want to use ZPL.
"...today consumers have been conditioned to think of beer when they see a bullfrog..."
This "choice overload" is just a symptom of bad computer science in general: nobody knows whether any of those systems is "better" than any of the others. Nobody even knows what "better" would mean.
Furthermore, the academic process rewards people not doing the work to find out. If you spent six months to find out that your hot new idea is actually (1) worse than what was there before and (2) not so new anyway, you don't get tenure because you don't publish enough.
Just use pthreads and forget that other nonsense.
Here's the reasoning. To do heavy multi-thread parallelism to speed up some kind of multi-media, game, data visualization program, you probably want a higher-level language with garbage collection to handle some kind of data flow model -- say Java with a good class library to support this -- in place of C with pthreads and trying to place locks on data without the whole thing deadlocking. Assume for sake of argument that Java is 4 times slower than C -- yeah, yeah, JIT compilers and loop and bounds checking optimizations, but there are C/C++ compilers that are really heavily optimized.
So what good do 4 processors do you? I am thinking you are better off optimizing the heck out of something in C++ to run on one processor than to get your marvelous data-flow class library working on 4 processors. Now if you had 100 processors, now you are talking about abandoning optimized C code with optimizing C compilers for something like Java in order to go for the parallelism without sweating P-threads.
Suppose you go the route Intel is pushing, of an optimizing C++ compiler that identifies loops that can be unrolled and made parallel and automatically creates, launches, synchronizes, and destroys threads behind the scenes. Well, if you had a compiler that could do that, wouldn't it be even more effective with whatever vector-processing capability is not utilized on Intel chips because no one has a compiler for it, or perhaps compiling to a GPU?
On one hand, I heard that 100 processor chips are in the development pipeline. On the other hand, I heard that maybe 16 processors is an upper limit for the shared memory multi-threaded model because of all of the cache synchronization issues, and to go beyond that, you will need to go to clusters with communication between local processor memory.
But the paltry level of processors you are getting now is probably OK for the OS running multiple apps with less processor contention, but to explicitly optimize any given app for thread parallelism, seems like a lot of effort for what, factor of 2 gain? Perhaps less?
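To put rough numbers on the trade-off above, here is a back-of-the-envelope sketch in Python. It takes the 4x slowdown that the comment assumes "for sake of argument" and, optimistically, perfect scaling across cores. The `parallel_fraction` parameter (Amdahl's law) is an addition that is not in the comment, included only to show how a serial portion eats into the gain. The function name is made up for illustration.

```python
def net_speedup(cores, language_slowdown=4.0, parallel_fraction=1.0):
    """Speedup of a parallel program written in a slower language,
    relative to optimized single-threaded code.

    language_slowdown: how many times slower the higher-level language is
        (4.0 is the figure assumed for the sake of argument above).
    parallel_fraction: share of the work that can run in parallel
        (Amdahl's law; 1.0 means perfect scaling, which is optimistic).
    """
    serial_part = 1.0 - parallel_fraction
    parallel_speedup = 1.0 / (serial_part + parallel_fraction / cores)
    return parallel_speedup / language_slowdown


if __name__ == "__main__":
    for cores in (4, 16, 100):
        ideal = net_speedup(cores)
        realistic = net_speedup(cores, parallel_fraction=0.9)
        print(f"{cores:>3} cores: ideal {ideal:.2f}x, 90% parallel {realistic:.2f}x")
```

Under these assumptions 4 cores only break even with the optimized serial version, which is the comment's point; anything above 1.0 means the parallel version wins.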
I'd say it's like: There are thousands of restaurants in the city... but you can't just choose a different one each time you go out.
You have to pick *ONE* restaurant, and put $10,000 on the tab.
Most people think of Python as procedural or object-oriented, but it actually has all of the tools required to do functional programming. And, being Python, the syntax is logical and easy to read.
Check out this series of articles for more info: http://www.ibm.com/developerworks/library/l-prog.html
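For a quick taste of what that looks like, here is a minimal sketch using only the standard library (`map`, `filter`, `functools.reduce`, and lambdas). The helper `compose` and the variable names are made up for illustration; they are not taken from the linked articles.

```python
from functools import reduce
from operator import add


def compose(*funcs):
    """Compose functions right to left: compose(f, g)(x) == f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), funcs)


square = lambda x: x * x
increment = lambda x: x + 1

# Build a new function out of existing ones instead of writing a loop.
square_then_increment = compose(increment, square)

numbers = range(1, 6)

# Squares of the even numbers, with no explicit loop or mutable state.
evens_squared = list(map(square, filter(lambda n: n % 2 == 0, numbers)))

# Sum of squares via a fold.
sum_of_squares = reduce(add, map(square, numbers), 0)

if __name__ == "__main__":
    print(square_then_increment(3))
    print(evens_squared)
    print(sum_of_squares)
```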
The parent has the correct perspective on this whole issue.
Programmers tasked with writing another workflow app or writing another e-commerce website are not going to even think about the dynamics of parallel programming (and don't need to). The developers/engineers building real-time robotic machinery will have been thinking about this since they were 16 years old.
MPI, pthreads and so on are really a poor way of doing parallel programming. The reality is that these languages are all simply serial languages with parallelism bolted on. What you really need to do is use a truly parallel language. Way back in 1990 I learned to program transputers using Occam which was parallel through and through. On that platform it was trivial to write pure parallel code and more to the point, you could write it in a very fine grained way which could easily be serialised to run on a smaller number of processors. In some ways it was similar to MPI but far more potent because of the built in support in the transputer architecture. It is very sad that in the intervening 20 years or so since the transputer was first invented, parallel programming has gone largely nowhere. Attempts at automatic parallelisation of serial code are doomed to failure and threading within serial languages is always going to be a blunt tool. Maybe in another 20 years we will be back where we were in the late 1980's.
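For readers who never saw Occam, the following is only a loose Python approximation of the style described above: independent processes that share nothing and talk over channels. It is not Occam, and Python's `queue.Queue` is a buffered queue rather than a true rendezvous channel; the stage names and the `DONE` sentinel are made up for illustration.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the stream


def producer(out_chan, values):
    """First stage: send each value down the channel, then signal the end."""
    for v in values:
        out_chan.put(v)
    out_chan.put(DONE)


def squarer(in_chan, out_chan):
    """Middle stage: read a value, square it, pass it on."""
    while True:
        item = in_chan.get()
        if item is DONE:
            out_chan.put(DONE)
            return
        out_chan.put(item * item)


def run_pipeline(values):
    """Wire the stages together and collect what comes out the far end."""
    a = queue.Queue(maxsize=1)  # small buffer keeps the stages in step
    b = queue.Queue(maxsize=1)
    stages = [
        threading.Thread(target=producer, args=(a, values)),
        threading.Thread(target=squarer, args=(a, b)),
    ]
    for t in stages:
        t.start()
    results = []
    while True:
        item = b.get()
        if item is DONE:
            break
        results.append(item)
    for t in stages:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3, 4]))
```

Because the stages only communicate through the channels, there are no locks on shared data to get wrong, which is part of what made the channel style attractive.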
"I have the attention span of a strobe lit goldfish, please get to the point quickly!"
to welcome our new Choice Overlord personally until I found that I misread.
Sorry.
A coworker I used to work with at a previous company told me once, "This place changes direction so frequently that we just go in circles and never get out of the forest."
You can't turn your back on progress, you have to pick an API and move forward. And, that's what is really going on right now in the professional world. Software engineers are picking what seems like the best solution to them, and learning how to implement it so they can move on with their projects and make the company money. Multi-threading is an important step forward in computing and cannot be ignored, and there will be a few dragged along kicking and screaming into the 21st century.
So clearly, we need more special features in compilers!
How about we give away 650MB of code comments -- we'll call it, uh, "programmer's commentary" -- with each copy? And an interactive "Hello, world"?
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
A 5 dimensional sphere is easy to visualize. The other two dimensions could be color and size, with the other three being the normal x y z coordinates.
People always assume that the extra dimensions are obscure and bizarre extensions of space-time. They don't have to be. A dimension can be used for any variable you want. A dimension could be reflectivity of light, smell, fluffiness, firmness, hardness, etc.
Diamonds, for instance, are priced on a four-dimensional scale (carat, color, clarity, cut). Those dimensions have no relationship to space-time.
People always assume that dimension #4 is time and the other dimensions are, well, unknowable. That definitely is not the case.
I meant it's an iterative language not a declarative one, but no edit option to fix it ;-(
is not an excess of choice, is an excess of improvisation.
Long story short... now that hardware speed is not easily doubled every few years, the industry has found a 'simple' way to keep pushing the wheel: duplicate cores! Well, it turns out that after decades of ignoring the parallel programming demand from academics, now they are trying to push the 'somewhat parallel' mess they are producing.
The problem is 'duplicated cores' != 'parallel programming', that's the problem.
What's in a sig?
Read the freaking comment. Yutz.
in parallel of course.
I for one welcome our new [noun] overlords. In Soviet Russia [noun] [verb]s you.
Basically, with any new interesting technology people try out many different approaches, and as the technology matures, a few of them will survive as the de-facto standards.
And I say that as a non-Roman.
If you like Python, you might also want to try Lua for FP. Unlike Python, it does feature tail call optimization and named functions work exactly the same as anonymous lambda-style constructs.
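The tail call point is easy to see from the Python side. A small sketch, with made-up function names: CPython keeps a stack frame for every call even when the call is in tail position, so a deep tail recursion hits the recursion limit where an equivalent loop does not.

```python
import sys


def count_down_recursive(n):
    """Tail-recursive in form, but CPython still adds a frame per call."""
    if n == 0:
        return "done"
    return count_down_recursive(n - 1)


def count_down_loop(n):
    """The same computation written as a loop, which uses constant stack."""
    while n > 0:
        n -= 1
    return "done"


def recursion_overflows(depth):
    """Return True if the recursive version fails at the given depth."""
    try:
        count_down_recursive(depth)
    except RecursionError:
        return True
    return False


if __name__ == "__main__":
    deep = sys.getrecursionlimit() + 100
    print("recursive version overflows:", recursion_overflows(deep))
    print("loop version:", count_down_loop(deep))
```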
I agree and disagree here. The way I see it, we'll always need human intelligence to explicitly annotate the source to indicate where nondeterministic behavior is permissible (or rather, where it's NOT allowed). But on the other hand I think compilers of the future will all use runtime analysis to refine JIT compilation, and they'll be able to spot very obscure parallelism opportunities that might only arise due to the actual data used in the evaluation of the program.
And I say that as a solipsist!
Choice is a way of life for programmers. Everything I do as a programmer has at least a dozen different alternative approaches to the same problem. What makes a good programmer is the ability to select wisely from this array of possibilities, sometimes catering to convenience, other times optimising for speed & memory. Programmers aren't about to emit shrill shrieks when the spectre of choice rears its hydra-like visage. Then again, nothing like a good scream.
prepare the survey weasels.
He makes a point. The point is that the parallel programming paradigm is still in its infancy, in that there are certain rules to follow in each of these API's and languages, but that most of them are variations of solutions that are still different enough from each other to warrant their existence (otherwise they would not exist). This simply means the field is moving.
As certain languages gain and lose momentum, so will the libraries, but the reverse is also true. When parallel programming really offers the bang for the buck for the kid in the street, it will automatically pull ahead the language or API that is easiest to use, and the API will either benefit or be killed from bloat explosion. Right now, parallel programming offers marginally noticeable improvements, because only now are we moving into the multi-core era for mainstream users, and the hardware is not quite proving that it works significantly better. The reasons are partly software as the whole software stack needs to be rethought and OS'es don't do a great job at harnessing all that power in a practical way. It could very well be that this is not simply a matter of language or API, but a matter of OS architecture.
In any case, it's about architecture, and not about sociologists fantasizing about a theoretical dilemma.
With great power comes great electricity bills.
Choice overload is a problem, but only if you are actually faced with all the choices. It need not be that way. There are various kinds of parallelism and various angles to attack each. Some of the choices one has will not make a lot of sense for the type of problem one is looking at. So, categorizing can help.
Some technologies will be in rapid development, others will be no longer actively maintained, and yet others will be stable but actively maintained. This also affects which choices are good.
Then there's licensing. Depending on the task, closed-source or copyleft licenses might not be acceptable.
Some of the solutions may be low-level, allowing programmers to build something matching their application out of the provided building blocks, where other solutions may focus on providing higher level constructs, ready to be used. Sometimes, these will match what you need, and sometimes, they won't.
I am sure there are other axes of differentiation. Setting requirements will narrow one's choices, as well as illustrate why choice is a Good Thing. If there were only a few choices, it is unavoidable that none of them would actually fit some sets of requirements.
Now, the thing is that categorizing the various solutions is not something that every potential user of the solutions has to do. Part of the work can be done by the developers of each solution. Presumably, the solution is developed because a satisfactory solution did not already exist. In my opinion, the developers _should_ list related work, compare their solution to it, and explain why they saw fit to develop their solution. This is a standard part of research.
Another part of the work is comparisons done by third parties. Some independent person would go and investigate a number of solutions, and provide a write-up of the requirements they assumed, the solutions they investigated, how these solutions fit their requirements, and what their overall impression of the solutions was (w.r.t. things like ease of setup, documentation, development status, etc.). This, too, is valid research. It should be published, so everyone benefits.
In the end, what you get to do when you need to pick a solution for parallel programming, is
1. Define your requirements
2. Get a list of possible solutions
3. See what has been written about them
4. Check if that seems to be valid (it might be out of date, for one)
5. Possibly investigate any solutions that you found but that haven't been covered by others.
6. Decide which one to go with, based on the information you have gathered.
Sure, this is a far cry from
1. Find the only available solution
2. There is no step 2
but in exchange you are almost guaranteed to get a choice that better fits your requirements (you would be very lucky to have the only available solution be a great match), without having to pay the full cost of investigating every solution out there.
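As a rough illustration of steps 1, 2 and 6 above, here is a minimal Python sketch that filters a list of candidates against a set of requirements. Everything in it is an assumption made for illustration: the candidate names, their attributes, and the three requirement axes (maintenance status, license, abstraction level) are invented and do not describe any real tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One hypothetical parallel programming solution."""
    name: str
    maintained: bool   # is it still actively maintained?
    license: str       # e.g. "permissive", "copyleft", "closed"
    level: str         # "low" (building blocks) or "high" (ready-made constructs)


def shortlist(candidates, *, acceptable_licenses, level, require_maintained=True):
    """Return the names of the candidates that satisfy every requirement."""
    return [
        c.name
        for c in candidates
        if c.license in acceptable_licenses
        and c.level == level
        and (c.maintained or not require_maintained)
    ]


# Invented data, standing in for "a list of possible solutions".
CANDIDATES = [
    Candidate("solution_a", True, "permissive", "low"),
    Candidate("solution_b", False, "permissive", "high"),
    Candidate("solution_c", True, "copyleft", "high"),
    Candidate("solution_d", True, "permissive", "high"),
]

print(shortlist(CANDIDATES, acceptable_licenses={"permissive"}, level="high"))
# ['solution_d']
```

With more candidates the filter usually leaves a handful rather than one, and those are the ones worth reading up on in steps 3 to 5.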
The thing to remember about the paradox of choice is that you will probably _feel_ less happy (there is always the nagging feeling that you could have made a better choice), but that you will generally end up with something _better_ than if the choice hadn't been there in the first place.
If you _really_ aren't happy about having to choose, you can always pick one (say, at random) and pretend that was your only choice. I conjecture that this is what the situation of having only one option is really like.
Please correct me if I got my facts wrong.
Seems to me to be a philosophical problem. With a single CPU you have a single gate processing a single sequence of instructions. It's easy to push the instructions and data through that gate in order. When you have 10, 20, 100 gates choosing which direction to push the instructions and data becomes exponentially more complex.
The solution it would seem to me would be to start pulling the instructions and data through the gates instead of pushing it.
There's one very simple solution to this problem of choice; write your own. After all, I don't think one even exists for Piet yet.
Parallel programming has many facets. Libraries like OpenMP can not be compared to Windows threads, for example: the former is a mechanism for doing task/data parallelism at language level, the latter offers the primitives for making threads. PThreads is a thread API similar to win32 threads. I don't see how there is an overload of choices, really.
This is a nascent area and so of course there is lots of choice and lots of different thoughts of how best to support parallel development.
In due course there will be consensus and the few winners will emerge.
Same as it ever was.
This is the problem I have with Java. Everyone and their dog comes out with some new freakin' framework each month. What is the latest Java craze? The opposite is also one of the things I liked about C and C++... the lack of a framework of the month syndrome, that is. Sure, there was less 're-usability' going on with those. But nearly everyone could go into a C or C++ program and debug it. With Java you have to find people with expertise in your flavor of framework to debug it. And IN DEPTH EXPERTISE, not just how to use it. And you weren't forced to learn a new framework each month, or use a new framework because all the 'new framework because it's a new framework' chasers have moved on. How many freakin' frameworks are there out there for Java? Too many. Too many to choose from, because you can't be sure what people will stick with. From what I see they don't stick with much. How about more consistency and less abstraction? We can abstract if we want to.
Programming is plagued with the choice of languages/frameworks for sequential programming. There are just too many to list. I'm kind of envious of the guy who light-heartedly chose Fortran for his parallel tasks. Maybe this strategy works for sequential programming too!
If all you've got is a hammer, you start to see every problem like a nail.
In parallel!
MPI and OpenMP aren't mutually exclusive either: you can use OpenMP intranode and MPI internode. Nearly all the others are implementation-defined academic/research projects, whereas MPI and OpenMP are standards used in the "real world" and supported by many vendors (Erlang sees real use in industry too, though it's not suitable for most kinds of parallel work where high numeric performance matters).
please, someone take over from here to correct the grammar.
Blah blah sig blah blah blah irony blah blah
I always find it amusing (in a sad kind of way) how people talk about Herb Sutter's "call to action" over this. It's not that I've got anything against Herb himself: he's a decent writer, an excellent speaker, and a guy who can use the word "expert" legitimately in areas like C++. But it's also not like he's the first guy to notice that modern desktop computer architectures have been heading for parallelisation rather than increased speed for several years now.
Despite being right in the thick of this culture shift myself — I'm sure I'm not the only one here who has been talking about this for a while, and is just seeing management catch up — I don't think this is going to be that big of a deal for most people. The harsh reality, for the buzzword-wielding consultants rubbing their hands with glee at a new programming approach they can hype up, is that most people just don't need all this.
Your average desktop PC is more than powerful enough for most things that most people do with it: Internet communications, writing documents, working with databases, shop floor software, and the like. As long as the operating system is reasonably smart about scheduling, the guys writing these common types of applications don't really have to know anything about multithreading, locking, message passing, and all that jazz. Similarly, your average mobile device has more than enough juice to dial another phone, write a quick e-mail, or capture a digital photo.
At the other end of the spectrum, serious servers (database, communications, whatever) have been dealing with parallel processing of many requests since forever. High-end systems doing serious maths (the guys modelling weather systems, say) have also been using massive parallelisation on their supercomputers for zillions of years now.
There is a gap between these different areas, which we might traditionally have called the "workstation" market: the guys doing moderate number crunching for CAD, scientific visualisation, simulations, and the like. Many modern games also fall into this classification. This market is ripe for a parallel processing revolution: historically it hasn't followed this approach very much because the hardware wouldn't really take advantage of it, yet the extra power is genuinely useful. But I don't think this represents some huge proportion of the software development industry as a whole. The guys working in these areas tend to be pretty smart, and will no doubt adopt useful practices and conventions fairly quickly now that the hardware has reached the point that they are useful.
As to what those conventions are, I just don't buy the whole "choice overload" theory. There are relatively few basic models for parallel processing: for example, you can have no shared state and communicate only through message passing, or you can have shared state. In the latter case, you then have the question of how to make sure that the sharing is safe, which leads to lock-based or lock-free approaches. Funky toys like transactional memory run at a slightly higher level than this, but they are ultimately constructed from the same building blocks, and again there are only a small number of approaches at this level to consider.
I'm not familiar with all of those libraries mentioned in the story, but I'll bet that those three classifications (no shared state, shared state with explicit locking, shared state without explicit locks) probably cover the models used by most if not all of them. If you understand the trade-offs in those, you can produce a sensible design, and then the toolkit or framework you use to code it up is mostly just an implementation detail. Given that the trade-offs are pretty obvious and will often steer projects clearly in one direction, I don't think there's really that much to choose at all.
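To make two of those classifications concrete, here is a minimal Python sketch of the same job (several workers counting) done once with shared state under an explicit lock and once with no shared state and message passing only. It is an illustration under assumptions, not a recommendation of either style: the worker and increment counts are arbitrary, and standard-library threads and queues stand in for whatever toolkit a real project would use.

```python
import queue
import threading

N_WORKERS = 4
N_INCREMENTS = 1000


def shared_state_with_lock():
    """All workers update one shared counter, guarded by an explicit lock."""
    total = 0
    lock = threading.Lock()

    def worker():
        nonlocal total
        for _ in range(N_INCREMENTS):
            with lock:              # only one thread may touch `total` at a time
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(N_WORKERS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def message_passing():
    """Each worker keeps private state and sends its result as a message."""
    inbox = queue.Queue()

    def worker():
        local = 0                   # private to this worker, never shared
        for _ in range(N_INCREMENTS):
            local += 1
        inbox.put(local)            # the only communication is this message

    threads = [threading.Thread(target=worker) for _ in range(N_WORKERS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(inbox.get() for _ in range(N_WORKERS))


print(shared_state_with_lock())  # 4000
print(message_passing())         # 4000
```

The trade-off shows up even at this size: the locked version pays for synchronisation on every increment, while the message-passing version synchronises once per worker but has to combine the partial results at the end.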
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
The overload is just a symptom of the real problem, and that is that parallel programming is just plain hard. We've had these issues for over a decade and we haven't seen a step function in use of parallel programming. It is difficult for most people to think of many things happening at the same time and to design and debug this class of program. We tend to start by thinking of a task in serial steps and then look for ways to add a little parallelism.
The folks who are low level systems programmers (OS and networks) tend to be folks who have an aptitude for thinking about parallelism and designing with parallelism in mind. There is a group of people in the scientific space who make use of parallelism, but then again they are PhD mathematicians and physicists. After that it drops off rapidly.
Maybe it has something to do with the way we are educated. Perhaps it is a more fundamental issue of brain wiring. After all, we can perform complex physical tasks in parallel, but maybe only a small segment of the population is wired to think about programming problems in parallel.
The chip guys are throwing more cores at us and we can't create the software to fully utilize the hardware due to this issue. Perhaps it is time to take a step back and to stop trying to solve the problem by throwing more and different programming packages at the problem and examine why folks have so much trouble in this area.
One size will never fit all in parallel computing. There's such a diversity of range (hardware), domain (application space), and user need that a plethora of parallel tools and design choices will always be needed.
SMP machines differ significantly from tightly-coupled clusters, or even loosely-coupled clusters (a la Google), so a range of parallel tools are needed. Some parallelism components inevitably come from language vendors and others come from O/S vendors. As such, programmer access to these tools may need to remain at a low level (perhaps for direct device access) or may rise to a higher level as part of a more abstract programming model.
What's more, tasks go parallel for different reasons: greater speed, greater memory, more responsive GUIs, or to make for a software architecture that's better suited for various data structures like workpiles or trees, etc.
With such a diversity of architectures, multilevel implementations, application domains, and designer needs, it's little wonder that different solutions have emerged. Given the four dimensions and 2-3 variations that I've described for each, I can see justification for at least 2^4 different parallel programming models, tools, or languages.
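The count behind that 2^4 can be spelled out with a short Python sketch. The four dimension names follow the paragraph above; the two variations listed for each are placeholders picked for illustration, not a definitive taxonomy.

```python
from itertools import product

# Four dimensions, two illustrative variations each (placeholder labels).
dimensions = {
    "architecture": ["SMP", "cluster"],
    "implementation level": ["low-level", "high-level"],
    "application domain": ["domain A", "domain B"],
    "designer need": ["speed", "memory"],
}

combinations = list(product(*dimensions.values()))
lower_bound = len(combinations)      # 2 * 2 * 2 * 2
upper_bound = 3 ** len(dimensions)   # if every dimension had 3 variations

print(lower_bound)  # 16
print(upper_bound)  # 81
```

So with 2-3 variations per dimension, the number of distinct combinations lands somewhere between 16 and 81.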
A final thought. If each of the popular programming languages were extended to support even one form of parallelism (perhaps SMP threading), in addition to the numerous parallel process libraries that exist, how many *more* solutions would then arise? Can we really expect language implementors not to extend their products in order to serve the needs of their current users?
I think it's inevitable: the babel of parallel polyglots is here to stay.
Randy
fork(), pthread_create(), and MPI_Init()
Choice Overload in...
... and the list goes on and on. If we aren't careful, the result could very well be a 'choice overload' experience with humans running away in frustration."
... and the list goes on and on. If we aren't careful, the result could very well be a 'choice overload' experience with programmers running away in frustration."
... and the list goes on and on. If we aren't careful, the result could very well be a 'choice overload' experience with customers running away in frustration."
;)
Languages in general:
"And then we show them the languages they can talk in: English, Spanish, French, German, Italian, Japanese,
Programming:
"And then we show them the programming environments they can work with: Eclipse, Gnats, IDLE, Visual Studio, JBoss,
Eating:
"And then we show them the fruits they can eat: Bananas, Oranges, Peaches, Melons, Apples,
(Polemic:)Are these social psychologist Windows users and citizens of a dictatorship? I guess so...
But there is another explanation -- less polemic.
It's a fundamental problem with "Choice". Combine that with an industrial topic and you get a lot of attention and funding.
So do you want to go to a bar with only one woman in it? Or a bar with 20? If you believe the premise of "choice overload" making you unhappy, you should choose the first, right?
Aside from that in many contexts the "choice overload" hypothesis is flat-out wrong (unless you really can claim to have felt a special rush of happiness just when you went into that bar with only one dame in it - and she wasn't, say, your girl already), there are open questions about how representative the test sample was. Psychological problems run culturally in certain populations. How can we be sure that the population tested for "choice overload" didn't share a psychological problem regarding choice that has no foundation in basic human psychology, but rather was relative to their own cultural limitations?
For most people in most cultures over history, the trick is to be happy with not much choice. That's generally the case for the working class, for the infantry soldiers, and for tribal peoples in environments of scarcity. Yet even in those cultures there are other classes for whom the trick is to be happy with a great deal of choice - the upper class, the generals, and tribal peoples in environments of plenty. Those whose cultures and religions derive primarily from desert (scarcity) environments are those driven craziest by "choice overload" - thus the Islamic meltdown, and the rejection of modern freedom by American fundamentalists. But we do have other cultures here. And the studies associated with the "choice overload" hypothesis, do not, I'll bet dimes to dollars, correct in any way for (sub)culture and psychological diagnosis.
"with their freedom lost all virtue lose" - Milton
What the 2000 paper is calling "choice overload" was already identified nine years prior by Douglas Coupland as "option paralysis". He defined it as "the tendency, when given unlimited choices, to make none."
Not a scientific study, but still...
(More Couplandisms)
- JJ
The same goes with the APIs/languages. Which one is better? Well, it depends on the problem. Fortran 95 might be good for most things, but some little-known language might blow it out of the water for your particular problem/hardware. What is worse for HPC developers is that the "best language" can change quickly. Say you go with an Intel x86 solution, but a couple of years later you get a huge grant from Sun to make a SPARC cluster. Everything changes: register, RAM, and bus latencies and bandwidths, supported hardware operations, etc. When big government labs consider HPC systems, price tag is usually the deciding factor. The developers are stuck finding and learning the best language for the hardware that they are given.
The good news, I think, is that within the next 5 years we will see the majority of parallel programming being targeted at the desktop/production server markets. Here the systems will be a lot more similar to each other (probably x86, probably DDR or whatever the standard RAM is at the time, etc.). It will become a matter of shaping the language to the logical problem to be solved, not the hardware, and I think a lot of the languages sponsored by big HPC companies won't gain a foothold in the market, or those companies won't bother releasing x86 versions. The clutter will clean itself up a bit.
In the long term, I'm looking forward to a good parallel compiler (some exist already but they don't work that well). IMHO the compiler should make the decisions on how to order the instructions for the hardware that it is compiled on, rather than the developer having to figure it out. Unfortunately Intel/AMD have been pretty closed about the innards of their hardware; this would have to change. We need to know not only that an instruction exists, but also how many clock cycles it takes to execute, latencies to other components within the socket, etc., so that we can make intelligent decisions on how to either code in a parallel language, or write a compiler to make those decisions.
Programming is done to create a product, not write in a specific language.
Choice is a good thing, for each language has its own pros and cons that need to be weighed when used.
Creating a one-off piece of custom satellite firmware has different needs and priorities than firmware for a video card. Banking software is different than automotive software.
The Kruger Dunning explains most post on
All the choices listed are distinctly different tools for different jobs. Whether that job is run in one SMP system, or one NUMA coupled system, or many networked systems... whether mutual exclusion is needed (and whether referenced frequently or rarely), whether IPC needs complex message queue management or just a single semaphore... anyone who thinks the number of choices is an issue doesn't know much about the subject.
Thanks for mentioning ZPL, I'm surprised that I missed such a nice parallel programming language previously.
But is the ZPL research and development ended? The alpha release dates back to 2004, and the developer release to 2005.
With all these choices of programming platforms, and all the knowledgeable programmers out there (as demonstrated by all the posts on this topic), then why is most of the software produced, both closed and open source, basically bug-ridden, bloated, resource hogging, memory leaking crap?
Funny then how my web browser glitches up on my dual-core system exactly like it did on my old single core, and still doesn't manage to put more than a 30% load on the whole CPU. This isn't a case of one thread doing so much that others don't have a chance to run, it's just some engineer being bad at not synchronizing everything the worker threads do.
While many desktop applications can benefit from a multi-threaded design, it really isn't the same thing as parallel programming. The need to use threads has nothing to do with multiple cores suddenly being available, and everything to do with the fact that certain API calls block, certain operations can be done in the background, and that pure UI threads rarely use their entire time slice - something that's been true for a long time.
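A minimal Python sketch of that distinction: one background thread absorbs a blocking call so the "UI" loop stays responsive, and nothing here needs a second core. The names `blocking_call` and `ui_loop_with_background_work`, and the delays, are made up for the example; `time.sleep` stands in for a real blocking API call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_call(delay):
    """Stand-in for an API call that blocks, e.g. network or disk I/O."""
    time.sleep(delay)
    return "done"


def ui_loop_with_background_work(delay=0.2, tick=0.01):
    """Keep 'handling events' while a worker thread waits on the blocking call."""
    ticks = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(blocking_call, delay)
        while not future.done():
            ticks += 1           # pretend to process one UI event
            time.sleep(tick)     # the UI thread rarely uses its whole time slice
        return future.result(), ticks


result, ticks = ui_loop_with_background_work()
print(result, "after", ticks, "UI ticks")
```

The gain is responsiveness rather than throughput: the total work is the same, but the UI loop keeps turning over while the worker waits.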
Here, read some. These are benchmarks comparing Java5 to C++
http://www.idiom.com/~zilla/Computer/javaCbenchmark.html
Java has improved a lot since its early versions and now it's comparable to the best platform specific compiled languages out there.
Check out my blog!
See subject.
Too much choice is counterproductive, and hurts your product?
Who knew?
Depending on the implementation language, making tasks parallel that are logically independent may not add any additional time or money. Sure, it adds a lot if you are adding it to an existing app written in a language that doesn't accommodate it naturally, but GP referred to writing applications, not rewriting existing applications. The point at which it becomes sensible to apply parallelism widely to new applications isn't the same point at which it becomes sensible to rewrite existing applications to incorporate parallelism.
Here's a simple example on how to get a video rendering distributed over 4 processors from a single command line:
capture-frame | scale-and-adjust-brightness | apply-effects | compress-to-jpeg > frame-00001.jpg
Each of the above commands could be simple one line scripts that encapsulate the required utilities (which in this example would be ImageMagick or netpbm calls). You can also distribute the processing over a cluster using rsh or ssh quite easily, with the pipe'd data flowing over the network connections...
Unix gave us the solution for parallel programming over 30 years ago (at least for certain types of high computational batch processing operations) and the wheel keeps getting reinvented over and over and over again. By learning a bit of shell script you can easily take advantage of the 4+ core CPUs that are coming out in the future without installing anything other than a Linux distribution.
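For anyone who wants to see what the shell does with such a command line, here is a rough Python sketch that wires processes together with pipes the same way. Each stage runs as its own operating-system process, so the scheduler is free to place the stages on different cores. The two stages are trivial text filters made up for the example (the commands in the post above are illustrative names too), and the helper `run_pipeline` is hypothetical.

```python
import subprocess
import sys

# Each stage is a tiny line-by-line filter, run in its own child process.
# They are invented stand-ins for real image-processing commands.
UPPERCASE = "import sys\nfor line in sys.stdin: sys.stdout.write(line.upper())"
EXCLAIM = (
    "import sys\n"
    "for line in sys.stdin: sys.stdout.write(line.rstrip('\\n') + '!\\n')"
)


def run_pipeline(stage_sources, data):
    """Run the stages as 'stage1 | stage2 | ...' and return the final output."""
    procs = []
    upstream = subprocess.PIPE            # first stage reads from a pipe we fill
    for source in stage_sources:
        proc = subprocess.Popen(
            [sys.executable, "-c", source],
            stdin=upstream,
            stdout=subprocess.PIPE,
        )
        if procs:
            procs[-1].stdout.close()      # parent drops its copy of the read end
        procs.append(proc)
        upstream = proc.stdout
    # Fine for small inputs; large inputs would need to be fed incrementally.
    procs[0].stdin.write(data)
    procs[0].stdin.close()
    output = procs[-1].stdout.read()
    procs[-1].stdout.close()
    for proc in procs:
        proc.wait()
    return output


print(run_pipeline([UPPERCASE, EXCLAIM], b"frame one\nframe two\n"))
```

Because every stage works line by line, a later stage can start on the first line while an earlier stage is still busy with the next one, which is where the parallelism comes from.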
Listen to my music.
All of the parallel programming paradigms we've seen so far just hurt my head in a 'but but that's not the Right Thing' kind of way. I'm not sure what the Right Thing to do is, but I'm sure multithreading ain't it, and even Haskell strikes me as overcomplicated for what ought to be a very simple solution if we just looked at it a little differently.
I think we need immutability and instruction-level parallelism at least, just to get halfway sane, and then a few new abstractions on top of that. The idea of 'views' is something that keeps circling around my head, and I'm trying to tease my brain into explaining what it means on my Livejournal - but as a first cut, I think Occam and Carl Hewitt's 'Ether' (and to a lesser extent, the Japanese Fifth Generation Project's KL-1) were heading toward the right track.
Of course, I don't have a formal CS background, so I may be just gibbering incoherently, but words are cheap and this stuff sure is fun to dream about.
You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
fixed link
You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
Neither the idiom.com link nor you get it.
Java is perceived as slow because it starts up slow.
All those fancy benchmarks aren't including startup times, which is what the user sees as slow.
Humans are all about first impressions, and startup time for a java app IS the first impression.
And if it's Java on the web, wow, does it take painfully long for the JVM to load.
They ARE out to get you simply because They are in it for themselves and they don't care about you.
That list doesn't make sense. In the first place, most of the entries are not so well-known as to really deserve a mention. In the second place, some are languages (e.g., Java) and others are APIs provided by operating systems (e.g., Windows threads). Code written in Java might use Windows threads when run on the Windows platform, or the same code might use some other underlying mechanism when run on another kind of platform, which really after all is the major point of using a platform-independent language.
;-)
Also, I can't believe POE was left off the list
Cut that out, or I will ship you to Norilsk in a box.
You've got it back asswords.
It's:
1) Make a cheaper clone of someone else's higher quality product, and charge only 15 percent less.
2) Make a cheaper yet clone of someone else's quality product, and charge half.
3) Profit. Most of the market is confused by all the products that "do the same thing, but cost twice as much" so "high quality" becomes "they must charge more for the brand name". Brand names get the margins, mass market manufacturers get the glory.
...
4) Taken aback by all this, you swig the last of your "Mountain Lightning" and slam down the lid on your Inspiron laptop. Using the dim glow of your Magnavox HDTV you make your way to bed, deftly dodging piles of debris. You rest your head on a warm heap of soiled clothes, and let the humming drone of your Xbox 360 lull you to sleep. "In a couple of weeks, I'll have enough money to buy Halo 3", you quietly console yourself.
Good Night.
Why this? Simply for one reason given in the article: cache misses. A Java class comes with an 8-byte per-instance overhead (16 bytes if it can be serialized), which means that less stuff will fit in the cache. Yes, a non-conservative garbage collector can improve locality of the data by putting things together in ways that C never can, but especially for numerical stuff, it will effectively halve the cache size or worse by insisting on carrying runtime type information everywhere. Ever compared the speed/memory of a Complex class with manually fiddling with the real and imaginary parts? The difference is shocking.
Until Java gets some decent structure support (a collection of primitive data items without any overhead, plus preferably operator overloading on just such structures), Java for high-performance numerical computation will be stuck in the pre-C days of abstraction.
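The cache arithmetic behind the "halve the cache size" remark can be checked with a few lines of Python. This is back-of-the-envelope only: the 8-byte header is the figure quoted above, the 32 KiB cache size is an arbitrary assumption, and alignment padding and the references needed to reach each object are ignored (both would make the object case worse).

```python
def values_that_fit(cache_bytes, payload_bytes, header_bytes=0):
    """How many values fit in the cache, ignoring alignment and references."""
    return cache_bytes // (payload_bytes + header_bytes)


CACHE = 32 * 1024   # hypothetical 32 KiB cache
HEADER = 8          # per-instance overhead quoted in the comment above

# Complex number made of two 4-byte floats: 8 bytes of payload.
print(values_that_fit(CACHE, 8))            # 4096 bare values
print(values_that_fit(CACHE, 8, HEADER))    # 2048 objects, i.e. half as many

# Complex number made of two 8-byte doubles: 16 bytes of payload.
print(values_that_fit(CACHE, 16))           # 2048 bare values
print(values_that_fit(CACHE, 16, HEADER))   # 1365 objects
```

So for single-precision complex numbers the header alone halves the number of values per cache, and for double precision it costs about a third.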