Is Parallel Programming Just Too Hard?

Posted by kdawson on Monday May 28, 2007 @04:24PM from the Moore's-Law-for-software dept.

pcause writes "There has been a lot of talk recently about the need for programmers to shift paradigms and begin building more parallel applications and systems. The need to do this and the hardware and systems to support it have been around for a while, but we haven't seen a lot of progress. The article says that gaming systems have made progress, but MMOGs are typically years late and I'll bet part of the problem is trying to be more parallel/distributed. Since this discussion has been going on for over three decades with little progress in terms of widespread change, one has to ask: is parallel programming just too difficult for most programmers? Are the tools inadequate or perhaps is it that it is very difficult to think about parallel systems? Maybe it is a fundamental human limit. Will we really see progress in the next 10 years that matches the progress of the silicon?"

Re:our brains aren't wired to think in parallel by Anonymous Coward · 2007-05-28 16:36 · Score: 5, Informative

I do a lot of multithreaded programming, this is my bread and butter really. It is not easy - it takes a specific mindset, though I would disagree that it has much to do with management. I am not a manager, never was one and never will be. It requires discipline and careful planning.

That said, parallel processing is hardly a holy grail. On one hand, everything is parallel processing (you are reading this message in parallel with others, aren't you?). On the other, when we are talking about a single computer running a specific program, parallel usually means "serialized but switched really fast". At most there is a handful of processing units. That means that whatever you are splitting among these units has to lend itself well to being split that many ways. Split it into more pieces and you overload one of the units; split it into fewer and you underutilize resources. In the end, it is often easier to do the processing serially. The potential performance advantage is not always very high (I know, I get paid to squeeze out this last bit), and is usually more than offset by difficulty in maintenance.
Re:Two words: map-reduce by allenw · 2007-05-28 17:15 · Score: 5, Informative

Implementing MapReduce is much easier these days: just install and contribute to the Hadoop project. This is an open source, Java-based MapReduce implementation, including a distributed filesystem called HDFS.
Even though it is implemented in Java, you can use just about anything with it, using the Hadoop streaming functionality.
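For anyone who hasn't seen the model, here is a minimal in-process sketch of the map, shuffle and reduce steps as a word count. It does not use Hadoop or its API at all; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and a real framework would run the map calls on many machines rather than in one loop.

```python
from collections import defaultdict
from functools import reduce


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(lines):
    # Each map_phase call is independent of the others, which is what lets a
    # real framework spread the map step across many workers.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["the quick fox", "the lazy dog"]))
```

The point of the structure is that neither the map step nor the per-key reduce step shares any state, so the programmer never writes a lock.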
Re:Nope. by Anonymous Coward · 2007-05-28 18:00 · Score: 4, Informative

Functional programming is no harder than procedural/OO programming. A good functional interpreter can already draw this kind of parallelism out of a program. And yes, there are compiled functional languages with parallelizing compilers. (Erlang and OCaml come to mind)
Re:Nope. by Durandal64 · 2007-05-28 18:30 · Score: 3, Informative

Multi-threaded programming is very difficult. But some things just shouldn't be done on multiple threads either. Multi-threading is a trade-off: you get (generally) better efficiency and performance in exchange for, in many cases, vastly more complex control logic. This greater complexity means that the program is much more difficult to debug and maintain. Sometimes multi-threading a program is just a matter of replacing a function call with a call to pthread_create(...). But sometimes a program just can't be multi-threaded without introducing unacceptable complexity. A lot of the complaints about the difficulty of multi-threading come from people trying to multi-thread programs that can't be easily changed.
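As a rough illustration of the easy case, here is the same "swap a call for a thread" change sketched in Python with threading.Thread standing in for pthread_create. The worker checksum and the two run_* functions are hypothetical. It only works this simply because each worker writes to its own slot and shares nothing else. Note that CPython's global interpreter lock means this CPU-bound example won't actually run faster; it only shows the shape of the change.

```python
import threading


def checksum(data, results, slot):
    """Hypothetical worker: sums one chunk and stores the result in its own slot."""
    results[slot] = sum(data)


def run_serial(chunks):
    results = [None] * len(chunks)
    for i, chunk in enumerate(chunks):
        checksum(chunk, results, i)  # plain function call
    return results


def run_threaded(chunks):
    results = [None] * len(chunks)
    # Same call, but handed to a thread instead of being made directly.
    threads = [
        threading.Thread(target=checksum, args=(chunk, results, i))
        for i, chunk in enumerate(chunks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every worker before reading the results
    return results


if __name__ == "__main__":
    print(run_threaded([[1, 2], [3, 4]]))
```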
Re:A different approach to parallel programming by jlarocco · 2007-05-28 18:50 · Score: 4, Informative

I've often wondered if parallel programming would be easier if it were done in Chinese characters instead of English/European alphabetical characters.

Perhaps having a Chinese character represent a simple block of pre-compiled code that does one simple thing. Then the characters could be placed in two-dimensional order to form parallel threads. This would require a completely different approach to compiler development. But that would be OK because compilers are stuck in the 1970s anyway.

Maybe I'm a "bit thick", but that doesn't make any sense to me. It's an interesting idea, but I just don't see how it'd help parallelize things. At the very least, it seems to be solving the wrong problem.

The biggest problem right now is that it's really hard to split most tasks into parts that can be performed at the same time. Once a parallel algorithm is devised, it's relatively easy to write a program that performs the task in parallel.

Also, I don't know what you mean about compilers being stuck in the 70s. There have been massive improvements to compilers in the last 40 years.

Just getting away from the idea of having code based on a very limited set of alphanumeric characters strung together like beads on a string might help unlock a whole new era of innovative approaches to parallel program development strategies.

But programming doesn't work like that. Individual characters in a programming language are almost irrelevant.

--
Maybe not
Lack of skill in the field by node159 · 2007-05-28 22:02 · Score: 5, Informative

I've seen it over and over in the industry, there is a distinct lack of parallel programming skill and knowledge, and even senior developers with years under their belt struggle with the fundamentals.

Parallel programming does add a layer of complexity and its inherent lack of general solutions does make abstracting its complexity away difficult, but I suspect that the biggest issue is the lack of training of the work force. It isn't something you can pick up easily without a steep learning curve with many hard lessons in it, definitely not something that can be incorporated as a new thing to be learnt on the fly with deadlines looming.
Another aspect is that it's fundamental to the design. Parallelism can and often will dictate the design, and if the software architects are not actively designing for it, or are not experienced enough to ensure that it remains a viable future option, future attempts to parallelise can be difficult at best.

Ultimately the key issues are:
* Lack of skill and training in the work force (including the understanding that there are no general solutions)
* Lack of mature development tools to ease the development process (debugging tools especially)

--
GPLv2: I want my rights, I want my phone call! DRM: What use is a phone call, if you are unable to speak?
Re:Nope. by stony3k · 2007-05-28 23:09 · Score: 3, Informative

The truth is languages such as C/C++ and Java (to a lesser extent) are not good languages to write parallel code in.
To a certain extent Java is trying to correct that with the java.util.concurrent library. Unfortunately, not many people seem to have started using it yet.

--
Freedom is not worth having if it does not include the freedom to make mistakes. - Mahatma Gandhi
Re: Nope. by Dolda2000 · 2007-05-28 23:16 · Score: 3, Informative

It is definitely hard to justify parallel programming, even though many computers are gaining SMP capabilities. The thing isn't necessarily that it is particularly hard to write multi-threaded applications -- the thing is that it is a lot harder to write a multi-threaded program than to write a single-threaded program. Suddenly, you have to introduce locks in all shared data structures and ensure proper locking in all parts of the program. That kind of thing just adds a significant part to the complexity of a program, and it requires a lot more testing as well. Therefore, justification is definitely needed.
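To make the locking overhead concrete, here is a small sketch in Python of the kind of thing that suddenly becomes necessary. The Counter class and the hammer and run helpers are made up for illustration; the point is that even a one-line increment on shared data needs a lock around it, and every access path has to remember to take it.

```python
import threading


class Counter:
    """Hypothetical shared structure guarded by a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write below is not atomic, so it is done under the lock.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def hammer(counter, n):
    """Call increment() n times from whichever thread runs this."""
    for _ in range(n):
        counter.increment()


def run(num_threads=4, n=10_000):
    counter = Counter()
    threads = [
        threading.Thread(target=hammer, args=(counter, n))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run())
```

Forget the lock in one place and the program may still pass most test runs, which is exactly why the extra testing burden is real.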
The real question then, is: Is it justified? To be honest, for most programs, the answer is no. Most interactive programs have a CPU-time/real-time ratio of a lot less than 1% during their lifetime (and very likely far less than 10% during normal, active use), so any difference brought by parallelizing them won't even be noticed. Other programs, like compilers, don't need to be parallelized, since you can just run "make -j8" to use all of your 8 cores at once. I would also believe that there are indeed certain programs that are rather hard to parallelize, like games. I haven't written a game in quite a long time now, and I don't know the advances that the industry has made as of late, but a game engine's step cycle usually involves a lot of small steps, where the execution of the next depends on the result of the previous one. You can't even coherently draw a scene before you know that the state of all game objects has been calculated in full. Not that I'm saying that it isn't parallelizable, but I would think it is, indeed, rather hard.
So where does that leave us? I, for one, don't really see a great segment of programs that really need parallelizing. There may be a few interactive programs, like movie editors, where the program logic is heavy enough for it to warrant a separate UI thread to maintain interactive responsiveness, but I'd argue that segment is rather small. A single CPU core is often fast enough not to warrant parallelizing even many CPU-heavy programs. There definitely is a category of programs that do benefit from parallelization (e.g. database engines which serve multiple clients), but they are often parallelized already. For everyone else, there just isn't incentive enough.
Re:I blame the tools by ensignyu · 2007-05-29 01:45 · Score: 3, Informative

FYI, that programming model is called futures.
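A minimal sketch of what futures look like in practice, here using Python's concurrent.futures from the standard library. The function slow_square is a hypothetical stand-in for an expensive computation. The pattern is: submit the work, get a placeholder back immediately, and only block when you actually need the value.

```python
from concurrent.futures import ThreadPoolExecutor


def slow_square(x):
    """Hypothetical stand-in for an expensive computation."""
    return x * x


def main():
    with ThreadPoolExecutor(max_workers=2) as pool:
        # submit() returns immediately with a Future, a placeholder for the result.
        future_a = pool.submit(slow_square, 3)
        future_b = pool.submit(slow_square, 4)
        # The caller is free to do other work here.
        # result() blocks only if the value is not ready yet.
        return future_a.result() + future_b.result()


if __name__ == "__main__":
    print(main())
```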
Multics by Mybrid · 2007-05-29 03:21 · Score: 3, Informative
Multics was the precursor to Unix. Multics was a massively parallel operating system. It was designed to be an OS for a public utility. The thinking back then was all Americans would have computer terminals like a vt100 and plug into a municipal computer over a network.
Multics was designed for long lived processes. Short lived processes are something we take for granted today but weren't assumed back then. Today we assume that the sequence is we open a program, perform a task, close the program. Microsoft Outlook, for example, relies on Outlook being closed as its cue to purge email that's been deleted. Programs are not designed to be up years-on-end. In fact, Microsoft reboots their OS every month with Windows Update. I've often speculated that the reboot is often requested not because the patch requires it but because Microsoft knows that its OS needs to be rebooted, often.
Why? Why wouldn't one just leave every application you've ever opened, opened?
The reason is that programmers cannot reliably write long running process code. Programs crashed all the time in Multics, something Multics wasn't very good at handling back in the 1960s. There was some research done and it was observed that programmers could write code for short lived processes fairly well but not long lived.
So, one lesson learned from the failure of Multics is that programmers do not write reliable, long running code.
Parallel processing is better suited to long running processes. Since humans are not good at writing long running processes it makes sense then that parallel processing is rare. The innovation to deal with this sticky dilemma was the client-server model. Only the server needs to be up for long periods of time. The clients can and should perform short lived tasks and only the server needs to be reliably programmed to run 24/7. Consequently you see servers have clusters, RAID storage, SAN storage and other parallel engineering and clients do not. In some sense, Windows is the terminal they were thinking of back in the Multics days. The twist is that given humans are not very good at writing long running processes then the client OS, Windows, is designed around short lived processes. Open, perform task, close. I leave Firefox open for more than a couple of days and it is using 300MB of memory and slowing my machine down. I close and reopen Firefox daily.
Threads didn't exist in the computing world until Virtual Memory with its program isolation came to be. What happened before threads? What happened before threads is that programmers were in a shared, non-isolated environment. Where Multics gave isolation to every program, Unix just recognizes two users: root and everyone else. Before Virtual Memory, this meant that all user programs could easily step on each other and programs could bring each other down. Which happened a lot.
Virtual Memory is one of the greatest computing advances because it specifically deals with a short coming in human nature. Humans are not very good at programming memory directly, i.e. pointers.
It wasn't very long after VM came out that threads were invented to allow intra-application memory sharing. Here's the irony though. There still has been no advancement in getting humans to perform reliable programming. Humans today are still not very good at programming memory directly, even with monitors, semaphores and other OS helpers.
When I was in my graduate OS class the question was raised then of "when do you invoke a thread?" given you probably shouldn't to avoid instability.
The answer was to think about threads then as "light weight processes". The teaching was that given this a thread was appropriate for:
1. IO blocked requests.
  Have one thread per IO device like keyboard, mouse, monitor, etc. There should be one thread dedicated to CPU only and the CPU thread controls all the IO threads. The IO threads should be given the simple task of servicing requests on behalf of the CPU thread.
2. Performance
  Onl
Re:Nope. by poopdeville · 2007-05-29 05:17 · Score: 3, Informative

Well, Lisp doesn't look like XML. But its s-expressions are able to deal with tree like structures just as easily. If you have some free time, I suggest reading http://www.defmacro.org/ramblings/lisp.html.

--
After all, I am strangely colored.