More Interest In Parallel Programming Outside the US?

More than a trend, it's a necessity by Cordath · 2008-03-24 20:01 · Score: 4, Insightful

Parallel programming is going to go mainstream, not because people find it interesting, but because that's the way hardware is forcing us to go. First mainframes, then workstations, and now desktops and laptops have moved away from single CPU cores. In every case, it has been a necessary evil due to the inability to pack enough power into a single monolithic processor. Does anyone actually think Intel, if they could, wouldn't happily go back to building single-core CPUs with the same total power as the multi-core CPUs they're making now?

Right now, parallel development techniques, education, and tools are all lagging behind the hardware reality. Relatively few applications currently make even remotely efficient use of multiple cores, and that includes embarrassingly parallel code that would require only minor code changes to run in parallel and no changes to the base algorithm at all.
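
As a rough illustration of the "minor code changes" point, here is a minimal Python sketch of an embarrassingly parallel job. The prime-counting workload and the function names are made up for the example and are not from the comment; the only difference between the two versions is who runs map().

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
import math


def is_prime(n: int) -> bool:
    """Trial division; each input is checked independently of the others."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def count_primes_serial(numbers):
    # Original serial version: one core does all the work.
    return sum(map(is_prime, numbers))


def count_primes_parallel(numbers, executor_cls=ProcessPoolExecutor, workers=4):
    # Same algorithm, unchanged; the executor spreads the calls over workers.
    # chunksize batches inputs per task (ThreadPoolExecutor ignores it).
    with executor_cls(max_workers=workers) as pool:
        return sum(pool.map(is_prime, numbers, chunksize=256))
```

In CPython the process pool is the variant that can actually use several cores for CPU-bound work; passing executor_cls=ThreadPoolExecutor keeps the same structure but is mostly useful for trying the code out. When run as a script with the process pool, the call should sit under an `if __name__ == "__main__":` guard.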

However, if you look around, the tools are materializing. Parallel programming skills will be in hot demand shortly. It's just a matter of time before the multi-core install base is large enough that software companies can't ignore it.

Reinders Is Wrong: Threads Are Not the Answer by MOBE2001 · 2008-03-24 20:07 · Score: 5, Insightful

One day soon, the computer industry will realize that, 150 years after Charles Babbage came up with his idea of a general purpose sequential computer, it is time to move on and change to a new computing model. The industry will be dragged kicking and screaming into the 21st century. For over 20 years, researchers in parallel and high-performance computing have tried to come up with an easy way to use threads for parallel programming. They have failed and they have failed miserably. Amazingly, they are still continuing to pursue the multithreading approach. None other than Dan Reed, Director of scalable and multicore computing at Microsoft Research, believes that multithreading over time will become part of the skill set of every professional software developer (source: cio.com). What is wrong with this picture? Threads are a major disaster: They are coarse-grained, they are a pain in the ass to write and hard to debug and maintain. Reinders knows this. He's pushing threads, not because he wants your code to run faster but because Intel's multicore CPUs are useless for non-threaded apps.

Reinders is not an evangelist for nothing. He's more concerned about future-proofing Intel's processors than anything else. You listen to him at your own risk because the industry's current multicore strategy will fail and it will fail miserably.

Threads were never meant to be the basis of a parallel computing model but as a mechanism to execute sequential code concurrently. To find out why multithreading is not part of the future of parallel programming, read Nightmare on Core Street. There is a better way to achieve fine-grain, deterministic parallelism without threads.

Re:Reinders Is Wrong: Threads Are Not the Answer by Anonymous Coward · 2008-03-24 20:48 · Score: 2, Insightful

The theory of fine-grained parallelism is fundamentally flawed by the fact that parallelisation itself incurs an overhead, due to locking and syncing shared data structures.

Your COSA stuff has already been investigated by researchers before. Basically, describing functional programming in a graph doesn't achieve anything. And if you want to parallelise like this (which is still nowhere near as efficient as hand-optimised coarse-grained parallelisation):

Core 1     Core 2

(2*3)   +  (4*6)
     (6+24)

you'll probably have to wire the registers between cores directly into each other in order to avoid the enormous overhead of an external chip.
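
To make the overhead argument concrete, here is a small Python sketch of the (2*3) + (4*6) split above, with two pool workers standing in for "Core 1" and "Core 2". This is an illustration under assumptions, not anyone's proposed design: each multiply is a single cheap operation, while every submit() and result() call goes through queues and locks, which is the synchronisation cost the comment is talking about.

```python
from concurrent.futures import ThreadPoolExecutor
from operator import mul


def evaluate_on_two_workers():
    # Two workers play the role of "Core 1" and "Core 2".
    with ThreadPoolExecutor(max_workers=2) as pool:
        left = pool.submit(mul, 2, 3)     # (2*3) on one worker
        right = pool.submit(mul, 4, 6)    # (4*6) on the other
        # Join point: both partial results have to travel back to one place
        # before the final add can happen.
        return left.result() + right.result()   # (6+24)
```

The answer is the same as the serial expression; the question raised in the thread is whether the hand-off costs more than the arithmetic it distributes.
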
Re:Reinders Is Wrong: Threads Are Not the Answer by dkf · 2008-03-24 22:31 · Score: 4, Insightful

Fine-grained parallelism works fine. It works in your NVidia, SIMD-based graphics coprocessor, does it not? Locking and syncing is a problem only in a non-deterministic environment like multithreading. Fine-grained parallelism is temporally deterministic because the temporal (concurrent or sequential) order of code execution can be precisely determined.

It really really depends on the problem and the algorithm. Some things are easy to parallelize, especially if they don't need (much) shared writable memory, but others are furiously hard.

you'll probably have to wire the registers between cores directly into each other in order to avoid the enormous overhead of an external chip.

Not really. There is a way to design a multicore processor such that only neighboring cores cooperate on related computation. It is part of the self-balancing mechanism. I can't go into detail but suffice it to say that if you keep your inter-core communication performance penalty at a fixed level regardless of the number of cores, you have a winner.

As I said above, it really depends on what you're doing. Some classes of problems just don't and can't have nice communication patterns, and if you've got one where you've got these inherent non-local effects, no amount of cleverness is going to let you avoid the hard fact that communication costs will dominate them. Other problems are much more tractable though; it's definitely not all doom and gloom. Just don't sound off and claim that it's all solved (nope!) or that some simple hardware-level cleverness will save us (nope again!) but instead study what's really known so that you can sound more knowledgeable. A good place to start reading up is with the thirteen dwarfs paper (PDF).

--
"Little does he know, but there is no 'I' in 'Idiot'!"
Re:Reinders Is Wrong: Threads Are Not the Answer by Anonymous Coward · 2008-03-24 22:55 · Score: 1, Insightful

Bollocks.

Multithreading is a solved problem (as programming is). The only thing that needs fixing with threads is the lack of cheap developers. But that is solvable too, using the Actor Model (look it up on Wikipedia, I'm too lazy to do it for you.) In short: don't share variables, use synchronized queues instead, and your threading problems vanish (at a price, of course.) You can do it in anything from assembler to Java, but some languages like Erlang make it damn easy.
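
A minimal Python sketch of the "don't share variables, use synchronized queues" recipe, assuming a toy counter as the state being protected. The class and message names are invented for the example. The count is only ever touched by the actor's own thread; everyone else talks to it through its inbox.

```python
import queue
import threading


class CounterActor:
    """Owns its state; other threads can only send it messages."""

    def __init__(self):
        self._inbox = queue.Queue()   # the only shared object, and it is synchronized
        self._count = 0               # read and written by the actor thread only
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            msg, reply = self._inbox.get()
            if msg == "incr":
                self._count += 1
            elif msg == "get":
                reply.put(self._count)
            elif msg == "stop":
                return

    def send(self, msg):
        # Fire-and-forget message.
        self._inbox.put((msg, None))

    def ask(self, msg):
        # Request/response: the answer comes back on a private reply queue.
        reply = queue.Queue(maxsize=1)
        self._inbox.put((msg, reply))
        return reply.get()

    def stop(self):
        self.send("stop")
        self._thread.join()


def hammer(actor, n):
    for _ in range(n):
        actor.send("incr")
```

The "price" mentioned above shows up here as one queue operation per increment instead of a bare `+= 1`. Erlang and Scala bake this pattern into the language or standard library; in Python it has to be spelled out by hand as above.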

Go out and learn Erlang or Scala.
Re:Reinders Is Wrong: Threads Are Not the Answer by Anonymous Coward · 2008-03-24 23:07 · Score: 1, Insightful

There does not have to be a penalty if neighboring cores share registers or even caches. The trick is to keep computations local to neighbors only. It can be done. It's a matter of instruction scheduling.

That's one hell of an assumption. Even then, you'd still be stuck with one bank of memory, seeing as this would not be feasible over a large-scale system. As soon as you have processors on different boards, attached to different banks of memory, you're going to have an absolutely massive overhead.

Ultimately, physics will solve the Von Neumann bottleneck.

Again.. one hell of an assumption. If that happens, many things in computer science will have to be completely re-thought.

Re:Duh? by dubl-u · 2008-03-24 20:24 · Score: 2, Insightful

"Science progresses one funeral at a time." - Max Planck

Software might be slightly better, as Moore's Law has been prodding us forward. On the other hand, given the number of us working in C-like languages (35+ years old), maybe with an OO twist (25+ years), to do web stuff (15-ish years), one funeral at a time might be more than we can manage. Legacy code, alas, can outlive its authors.

Re:Experince by Instine · 2008-03-24 20:46 · Score: 1, Insightful

but just wait...

If you think 2 cores is tricky, then how about 4? And if you're really going to make the most of the multiple cores, and you start to use them for complex permutations of solution finding, the complexity gets silly fast! 8 cores in total is already fairly common. This will have to more than double every 18 months to keep up with Moore's 'Law', which they have to do, to keep the mechanics of their capitalist framework ticking smoothly.

In ten years, efficient programming won't be difficult, it will be impossible unless we evolve our engineering concepts dramatically to adapt to this paradigm shift (sorry for the cliche phrase but it's apt). I believe the only way will be to use genetic algorithms (suited to multiprocessors themselves) to adaptively compile code. Effectively evolving it until it's optimized.

--
Because you can - or because you should?

Parallel programming has been with us for years! by supersnail · 2008-03-24 20:52 · Score: 3, Insightful

One of the reasons more seasoned programmers are not particularly interested is that in most cases someone else has already done the hard work.

Want to serve multiple users on multiple CPUs with your web pages? Then write a single threaded program and let Apache handle the parallelism. Same goes for JavaEE, database triggers, etc. etc. going all the way back to good old CICS and COBOL.
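
For a sense of what "write a single threaded program and let the server handle the parallelism" looks like, here is a hedged Python sketch of a WSGI application. The comment does not mention Python or WSGI; they are assumptions for this example. The handler itself is plain sequential code, and whatever server hosts it (a WSGI module under Apache is one possibility) decides how many threads or processes call it at once.

```python
def app(environ, start_response):
    """Plain sequential request handler: no locks, no threads, no shared state."""
    # PATH_INFO is the standard WSGI key for the request path.
    name = environ.get("PATH_INFO", "/").strip("/") or "world"
    body = f"hello, {name}\n".encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

This only stays painless as long as the handler keeps no mutable module-level state; once requests start sharing data, the server's threads become the application's problem too.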

It is very rare that you actually need to do parallel programming yourself. Either you are doing some TP monitor like program which schedules tasks other people have written, in which case you should use C and POSIX threads (anything else will land you in deep sewage), or you are doing serious science stuff, in which case there are several "standard" Fortran libraries to do multithreaded matrix math -- but if the workload is "serious" you should be looking at clustering anyway.

--
Old COBOL programmers never die. They just code in C.

Not So Great by yams · 2008-03-24 21:06 · Score: 4, Insightful

Been there, done that. Good from far, but far from good.

As an engineer straight out of college, I was very interested in parallel programming. In fact, we were doing a project on parallel databases. My take is that it sounds very appealing, but once you dig deeper, you realise that there are too many gotchas.

Considering the typical work-time problem, let's say a piece of work takes n seconds to complete by 1 processor. If there are m processors, the work gets completed in n/m seconds. Unless the parallel system can somehow do better than this, it is usually not worth the effort. If the work is perfectly divisible between m processors, then why have a parallel system? Why not a distributed system (like beowulf, etc.)?

If it is not perfectly distributable, the code can get really complicated. Although it might be very interesting to solve mathematically, it is not worth the effort, if the benefit is only 'm'. This is because, as per Moore's law, the speed of the processor will catch up in k*log m years. So, in k*log m years, you will be left with an unmaintainable piece of code which will be running as fast as a serial program running on more modern hardware.
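
The "k*log m years" figure can be worked through with a short Python sketch. Two assumptions are mine, not the poster's: the logarithm is base 2, and k is the doubling period of single-processor speed, taken here as 1.5 years. The function name is hypothetical.

```python
import math


def catch_up_years(m: int, doubling_period_years: float = 1.5) -> float:
    """Years until one processor is m times faster, if speed doubles every period.

    Solves 2 ** (t / k) == m for t, giving t = k * log2(m).
    """
    return doubling_period_years * math.log2(m)
```

Under those assumptions a 4-way speedup is matched by serial hardware in 3 years and an 8-way speedup in 4.5 years. The argument only holds while single-processor speed keeps doubling, which is exactly what other posters in this thread dispute.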

If the parallel system increases the speed factor better than 'm', such as by k^m, the solution is viable. However, there aren't many problems that have such a dramatic improvement.

What may be interesting are algorithms that take serial algorithms and parallelise them. However, most thread scheduling implementations already do this (old shell scripts can also utilise parallel processing using these techniques). Today's emphasis is on writing simple code that will require less maintenance, rather than on linear performance increase.

The only other economic benefit I can think of is economy of volumes. If you can get 4GHz of processing power for 1.5 times the cost of a 2GHz processor, it might be worth thinking about it.

Parent is first reply that gets it... by Anonymous Coward · 2008-03-24 21:11 · Score: 2, Insightful

It's only been twelve years since I entered the workforce in the US, but I have been studying parallel programming for almost 15 years (three years in university).

The future isn't "multi-threaded" unless you count SPMD, because architecturally the notion of coherent shared memory is always going to be an expensive crutch. Real high-performance stuff will continue to work with distributed, local memory pools and fast inter-node communication... whether the nodes are all on chip, on bus, in the box, in the datacenter, etc.

As they have been since the 80s at least, many CS researchers will be trying to find the holy grail of programming models and tools to automatically parallelize larger classes of algorithms for naive programmers. And in the meantime, just as we have since the 80s at least, programmers will often go to bare metal and hand-optimize important libraries and applications that are too important to leave to the vagaries of the immature tools.

If there are fewer programmers in the US or Europe who worry about parallelism, it is only because the economy is such that they can still satisfy their customers without it. There are many parallel programmers here too, and maybe there isn't a pressing need for even more because their work is being reused. I'm not sure what sort of statistical analysis you should use to determine prevalence of parallel systems usage, but I am pretty sure it is not by counting programmer interest. How many deployed systems are there? How many CPU-hours of parallel work are being done? What fraction of IT budgets are supporting production use of parallel systems?...

To be frank, I think the vast majority of computer cycles are spent executing code written by a small minority of the programmers in the world. These are the ones that matter, as far as optimizing for parallel environments. Sorry if that sounds elitist, but it's a bit like trying to analyze the distribution of race car drivers by surveying the interests of all licensed motorists.

Re:Parent is first reply that gets it... by Anonymous Coward · 2008-03-24 21:28 · Score: 1, Insightful

I think you've nailed it. There are no stats quoted in the article. The US still has more than 50% of the world's top500 machines (and I wonder what that figure rises to if you were to measure the top10000?) and is this Dobb's Code Talk telling me that nobody is writing code specifically for those machines? Bullshit. I'm not an American, nor have I been to the US, but I bet the parallel programmers are far more numerous there than here.

Re:Experince by LiquidCoooled · 2008-03-24 21:23 · Score: 3, Insightful

Using genetic algorithms won't help unless you understand the underlying issues with multi-core.
Computer software is notorious for not understanding what the operator wants ("It looks like you are writing a sorting algorithm.."), what makes you think this will be any different?

(I am not knocking GA coding methods but just using it as a blanket extension to job security is misguided at best)

--
liqbase :: faster than paper

WTF question is this???? by Aceticon · 2008-03-24 22:39 · Score: 3, Insightful

Those of us doing server side development for any medium to large company will have already been doing multi-threaded and/or multi-process applications for ages now:
- When Intel was still a barely known brand, other companies were already selling heavy-iron machines with multiple CPUs for use in heavy server-side environments (didn't run Windows though). Multi-cores are just a variant of the multiple-CPU concept.

The spread of Web applications just made highly multi-threaded server-side apps even more widespread - they tend naturally to have multiple concurrent users (<rant>though some web-app developers seem to be sadly unaware of the innate multi-threadness of their applications ... until "strange" problems start randomly to pop-up</rant>).

As for thick client apps, for anything but the simplest programs one always needs at least 2 threads, one for GUI painting and another one for processing (either that or some fancy juggling a la Windows 3.1)

So, does this article mean that Japan, China, India and Russia had no multi-CPU machines until now ... or is this just part of a PR campaign to sell an architecture which, in desktops, is fast growing beyond its usefulness (a bit like razors with 4 and 5 blades)?

Oversimplified by Moraelin · 2008-03-25 00:00 · Score: 3, Insightful

That's an oversimplified view.

It's more like when you've got enough experience, you already know what can go wrong, and why doing something might be... well, not necessarily a bad idea, but cost more and be less efficient anyway. You start having some clue, for example, what happens when your 1000 thread program has to access a shared piece of data.

E.g., let's say we write a massively multi-threaded shooter game. Each player is a separate thread, and we'll throw in a few extra threads for other stuff. What happens when I shoot at you? If your thread was updating your coordinates just as mine was calculating if I hit, very funny effects can happen. If the rendering is a separate thread too, and reads such mangled coordinates, you'll have enemies blinking into strange places on your screen. If the physics or collision detection does the same, that-a-way lies falling under the map and even more annoying stuff.
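
Here is a small Python sketch of the coordinate problem, using invented names (Player, move_to, position). The mover thread always sets x and y to the same value, so any reader that sees x != y has caught a half-updated position. The lock is what prevents that; dropping it from either method reopens the window in which a renderer or hit test could read the new x with the old y.

```python
import threading


class Player:
    def __init__(self):
        self._lock = threading.Lock()
        self._x = 0
        self._y = 0

    def move_to(self, x, y):
        # Both fields change inside one critical section.
        with self._lock:
            self._x = x
            self._y = y

    def position(self):
        # Readers take the same lock, so they get a consistent snapshot.
        with self._lock:
            return (self._x, self._y)


def run_demo(steps=20000):
    player = Player()
    torn_reads = 0

    def mover():
        for i in range(steps):
            player.move_to(i, i)      # invariant: x == y

    t = threading.Thread(target=mover)
    t.start()
    while t.is_alive():
        x, y = player.position()
        if x != y:
            torn_reads += 1           # would mean a half-updated position
    t.join()
    return torn_reads
```

With the lock in place run_demo() returns 0. Note that this only makes one player's position consistent; a hit test that reads two players still needs a rule for which locks to take and in what order.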

Debugging it gets even funnier, since some race conditions can happen once a year on one computer configuration, but every 5 minutes on some hapless user's. Most will not even happen while you're single-stepping through the program.

Now I'm not saying any of that is unsolvable. Just that when you have a given time and budget for that project, it's quite easy to see how the cool, hip and bleeding-edge solution would overrun that.

By comparison, well, I can't speak for all young 'uns, but I can say that _I_ was a lot more irresponsible as the stereotypical precocious kid. I did dumb things just because I didn't know any better, and/or wasted time reinventing the wheel with another framework just because it was fun. All this on the background of thinking that I'm such a genius that obviously _my_ version of the wheel will be better than that built by a company with 20 years of experience in the field. And that if I don't feel like using some best practice, 'cause it's boring, then I know better than those boring old farts, and they're probably doing it just to be paid for more hours.

Of course, that didn't stop my programs from crashing or doing other funny things, but no need to get hung up on that, right?

And I see the same in a lot of hotshots nowadays. They do dumb stuff just because it's more fun to play with new stuff, than just do their job. I can't be too mad at them, because I used to do the same. But make no mistake, it _is_ a form of computer gaming, not being t3h 1337 uber-h4xx0r.

At any rate, rest assured that some of us old guys still know how to spawn a thread, because that's what it boils down to. I even get into disputes with some of my colleagues because they think I use threads too often. And there are plenty of frameworks which do that for you, so you don't have to get your own hands dirty. E.g., everyone who's ever written a web application, guess what? It's a parallel application, only it's the server which spawns your threads.

--
A polar bear is a cartesian bear after a coordinate transform.

Re:What are the applications? by Tony+Hoyle · 2008-03-25 00:11 · Score: 2, Insightful

Actually No. In that case you'd want to use a maximum of 3. The OS needs to do its own processing and if you're hogging all 4 cores your app will end up slower because every time it does something OS dependent like accessing the disk or network it'll be waiting around for the OS to catch up.
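
The "use 3 of 4" rule of thumb can be written down as a short Python helper. The function name and the default of one reserved core are assumptions for the example; whether reserving a core actually helps depends on the workload, so treat it as the poster's heuristic rather than a measured result.

```python
import os


def worker_count(reserve_for_os: int = 1) -> int:
    """Number of workers to start, leaving some cores free for the OS."""
    total = os.cpu_count() or 1       # cpu_count() may return None
    return max(1, total - reserve_for_os)
```

On a 4-core machine this gives 3; on a single-core machine it still returns 1 so the program can run at all.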

Threads: Threat or Menace by martincmartin · 2008-03-25 00:21 · Score: 5, Insightful

It always surprises me how many people say "we have to multithread our code, because computers are getting more cores," not realizing:

There are often other ways to do it, e.g. multiple processes communicating over sockets, or multiple processes that share memory.
Threads are hard to get right. Really, really hard.

When your library of mutexes, semaphores, etc. doesn't have exactly the construct you need, and you go to write your own on top of them, it's really, really hard not to introduce serious bugs that only show up very rarely. As one random example, consider the Linux kernel team's attempts to write a mutex, as described in Ulrich Drepper's paper "Futexes are Tricky."

If these people take years to get it right, what makes you think *you* can get it right in a reasonable time?

The irony is that threads are only practical (from a correctness/debugging point of view) when there isn't much interaction between the threads.

By the way, I got that link from Drepper's excellent "What Every Programmer Should Know about Memory." It also talks about how threading can slow things down.

Re:Threads: Threat or Menace by Anonymous Coward · 2008-03-25 01:17 · Score: 1, Insightful

Threads are hard to get right. Really, really hard.

Hogwash. Feel free to keep believing that though while the rest of us write functioning multi-threaded code.

If these people take years to get it right, what makes you think *you* can get it right in a reasonable time?
The paper you link to is specifically talking about the Linux futex implementation. Futexes are a special case mutex, and apparently when they were introduced in the 2.5 kernel they were seriously broken: from a quick skim on the paper, whoever wrote them wasn't thinking too hard or had no idea what a spinlock was, for a start.

It may surprise you to learn that futexes are not required for the application developer: they are an implementation detail. In reality you only have one primitive to worry about, the counting semaphore. Everything else is a derivation of it: mutexes, futexes (which are really just mutexes), conditionals and all the other fancy "primitives" which aren't actually primitives, all boil down at the core to a single semaphore to control concurrency.

Those of you who were awake and paying attention during your expensive college courses should already understand critical sections. I'm sure you all have enough experience to understand why global variables are a bad idea. So why exactly should the idea of concurrent programming suddenly seem hard?
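
The claim that a mutex is just a counting semaphore can be sketched in Python as follows. SemaphoreMutex and count_with_mutex are names made up for the example. One caveat: CPython's threading.Semaphore is itself built on top of a lock and condition variable, so this shows the conceptual derivation, not how any particular OS implements it.

```python
import threading


class SemaphoreMutex:
    """A mutex expressed as a counting semaphore with an initial count of 1."""

    def __init__(self):
        self._sem = threading.Semaphore(1)

    def __enter__(self):
        self._sem.acquire()     # count 1 -> 0; other threads block here
        return self

    def __exit__(self, exc_type, exc, tb):
        self._sem.release()     # count 0 -> 1; lets one waiter through
        return False            # do not swallow exceptions


def count_with_mutex(threads=4, increments=2000):
    mutex = SemaphoreMutex()
    total = 0

    def work():
        nonlocal total
        for _ in range(increments):
            with mutex:         # critical section around the shared counter
                total += 1

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return total
```

Unlike a typical mutex, nothing here stops a thread that never acquired the semaphore from releasing it, which is one of the ways hand-rolled primitives go wrong.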

By the way, plenty of other OSes managed to implement futexes and got them right long before the Linux developers suddenly woke up to the fact that threaded programming was important and they might need to actually support it. BeOS managed to provide a heavily multi-threaded implementation with working futexes from the early 90's. So the real question is, if the Linux developers can't manage it, are they as smart as you think they are?

It's been mainstream for years by Nursie · 2008-03-25 00:38 · Score: 2, Insightful

You just haven't noticed.

Multi process apps have been common in the business and server app space for almost two decades.
Multi thread apps have been common in the business and server world for a few years now too.

To all having the will it/won't it go mainstream argument: You missed the boat. It's mainstream.

Re:Nowt wrong with C and POSIX threads by Metasquares · 2008-03-25 01:42 · Score: 2, Insightful

Anything harder than a problem needs to be is too hard. The question is whether concurrency can be made easier while preserving the benefits it provides.

No issue with parallel programming anywhere. by sonofagunn · 2008-03-25 01:49 · Score: 2, Insightful

For your examples you threw out a bunch of problems that are currently being parallelized just fine by today's software, which would indicate we're not having problems with parallel programming. Name a search engine, database query engine, weather simulation software, etc, that isn't already multithreaded. Where's the issue?

Problems that can and should be parallelized in software already are for the most part. There is no issue here.

Business processes are often serial (step B depends on output of step A). That's what a lot of corporate programmers work on. And even these have steps that are done in parallel or can run multiple instances of a process in parallel. Anyone working on a web application or j2ee infrastructure is probably running lots of their small, serialized problems, in parallel. Again, there is no issue here.

Maybe because we've seen the hype cycle before... by alispguru · 2008-03-25 02:10 · Score: 3, Insightful

Anybody else remember the great clock cycle stall of the 1980's? During that period, Moore's Law operated in a manner closer to its original statement - the big news was the drop in cost per transistor, not raw CPU speed. The general wisdom at the time was that parallelism was going to be the way to get performance.

And then, we entered the die shrink and clock speed up era, clock speeds doubled every 14 months or so for ten years, and we went smoothly from 60 MHz to 2 GHz. Much of the enthusiasm for parallel programming died away - why sweat blood making a parallel computer when you can wait a few years and get most of the same performance?

Clock speeds hit a wall again about five years ago. If the rate of increase stays small for another five years, the current cycle drought will have outlasted the 1980's slowdown. I have a great deal of sympathy for parallel enthusiasm (I hacked on a cluster of 256 Z80's in the early 80's), but I think it won't really take off until we really have no other choice, because parallelism is hard.

--

To a Lisp hacker, XML is S-expressions in drag.

Re:Duh? by religious+freak · 2008-03-25 03:57 · Score: 2, Insightful

I think it's reasonable to assume programming skill among developers would follow a bell-curve, in which case not only is your example misleading, it's not applicable.

--
If you can read this... 01110101 01110010 00100000 01100001 00100000 01100111 01100101 01100101 01101011

Re:Duh? by kscguru · 2008-03-25 05:42 · Score: 2, Insightful

Name a single real world problem that doesn't parallelize. I've asked this question on slashdot on several occasions and I've never received a positive reply.

GUIs (main event loop) for word processors, web browsers, etc. (Java feels slow because GUI code in Java is slower than human perception). Compilation (see below - it's a serial problem with a just-good-enough parallel solution). State machine simulations like virtualization or emulation. Basically, any task where the principal unit of work is either processing human input or complex state transitions. Only numerical simulations - problems with simple state transitions - parallelize well.

Real world problems like search, FEA, neural nets, compilation, database queries and weather simulation all parallelize well. Problems like orbital mechanics don't parallelize as easily but then they don't need parallelism to achieve bounded answers in faster than real time.

FEA, neural nets, and weather simulation are almost entirely number crunching - a small set of initial data plus a very large number of matrix multiplications. These are embarrassingly parallel, are well-understood and easy to optimize. The tradeoffs, in terms of locking / memory sharing / communication overheads, are documented by the past 40 years of literature; innovation is in new memory architectures or locking schemes or access patterns that tweak the costs of a small part of the program a little. And only ~1% of programmers out there even work on these sorts of problems.

Search and database queries do parallelize well, but not because of multiple processors. These two operations are fundamentally I/O bound - the data sets are too large to fit into memory, so you switch to an event model and do processing when data arrives. More raw CPU speed helps only a little (maybe 1% of the processing can overlap) - the actual gain is in larger caches and memory hierarchies. I doubt the original articles meant to call "overlapping I/O" parallel programming.
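
A hedged Python sketch of the "event model" described here: one thread, several outstanding I/O waits, and processing that happens as data arrives. The shard names and the sleep standing in for disk or network latency are assumptions for the example; no real database is involved.

```python
import asyncio


async def fetch_rows(shard, delay):
    # asyncio.sleep stands in for waiting on disk or network.
    await asyncio.sleep(delay)
    return [f"{shard}-row{i}" for i in range(3)]


async def query_all(shards, delay=0.05):
    # All waits overlap on a single thread; total time is roughly one delay,
    # not one delay per shard. gather() returns results in argument order.
    results = await asyncio.gather(*(fetch_rows(s, delay) for s in shards))
    return [row for rows in results for row in rows]
```

Calling asyncio.run(query_all(["a", "b"])) returns the six rows in shard order. No extra core is used, which is the distinction the comment draws between overlapping I/O and parallel programming.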

Compilation is expressly NOT a parallelizable problem. You may think it is - you fire off make / distcc and a whole storm of compilation happens - but you accept this only because compilers skip even the easiest inter-file optimizations because even attempting them serializes the problem so thoroughly that it becomes single-threaded. All the gains of JIT compilers are possible with static compilation too - but the cost of doing so is too high, so static compilers do very little optimization and JIT compilers get impressive gains with very simple optimizations on hot-paths despite terrible type-checking overheads. Compilation is in a mediocre state now - and we have dozens of languages prospering at different points on the cost curves - because it is an intrinsically SERIAL problem of fantastic complexity. I could write the most complex neural network algorithms on a single sheet of paper; even the parsing tree for the simplest languages won't fit on that page, much less optimization passes.

Most computers in this world are running code to solve inherently serial problems. Saying that numerical methods' sort of parallel programming has broader applicability is ignorant of all the problems outside that narrow area. Sorry.

--

A witty [sig] proves nothing. --Voltaire

Re:Duh? by Lally+Singh · 2008-03-25 05:52 · Score: 4, Insightful

Ugh.

Yes you can parallelize a VR system quite well. You can simulate a couple dozen NPCs per core, then synchronize on the collisions between the few that actually collide. You still get a nice speedup. It ain't 100% linear, but it can be pretty good. The frame-by-frame accuracy requirements are often low enough that you can fuzz a little on consistency for performance (that's usually done already. ever heard "If it looks right, it's right?").
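
The "simulate a couple dozen NPCs per core, then synchronize on the collisions" scheme can be sketched in Python like this. Everything concrete is an assumption for illustration: NPCs are (position, velocity) pairs on a line, a collision is two NPCs closer than a radius, and the function names are invented.

```python
from concurrent.futures import ThreadPoolExecutor


def step_chunk(chunk, dt):
    # Each NPC is (x, velocity); a chunk is updated without touching shared state.
    return [(x + v * dt, v) for x, v in chunk]


def find_collisions(npcs, radius):
    # Serial synchronization point, run once all chunks have been stepped.
    hits = []
    for i in range(len(npcs)):
        for j in range(i + 1, len(npcs)):
            if abs(npcs[i][0] - npcs[j][0]) < radius:
                hits.append((i, j))
    return hits


def simulate_frame(npcs, dt=1.0, radius=0.5, workers=4, chunk_size=24):
    # Split the NPCs into independent chunks, one task per chunk.
    chunks = [npcs[i:i + chunk_size] for i in range(0, len(npcs), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        moved = pool.map(step_chunk, chunks, [dt] * len(chunks))
        npcs = [npc for chunk in moved for npc in chunk]
    return npcs, find_collisions(npcs, radius)
```

The structure is the point: an independent phase followed by a short serial phase, which is why the speedup is good but not linear. In CPython the thread pool will not speed up pure-Python arithmetic, so a real engine would use processes or native code for the per-chunk step.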

Parallel programming is how we get more speed out of modern architectures. We're being told that we're not going to see Moore's law expand in GHz like it used to, but in multiple cores. Nobody thinks it's a panacea, except maybe the 13yr old xbox kiddies, but they never counted.

As for making impossible into possible, sure it will. There are lots of things you couldn't do with your machine 10-15 yrs ago, you can do now. Many systems have performance inflection points. As we get faster (either with a faster clock or a larger number of cores), we're going to cross more of them. I remember when you couldn't decode an mp3 in real time on a desktop machine. With the I/O and space costs of uncompressed music, that meant that you didn't really listen to music from your computer. Impossible -> Possible.

--
Care about electronic freedom? Consider donating to the EFF!

Real world solution by Anonymous Coward · 2008-03-25 09:46 · Score: 1, Insightful

First of all, the quote is "Nine women can't have a baby in one month." Parallelism shortens time (to one month) by increasing processors (women). There is only one product (a baby).

That said, pregnancy is a good real world parallelism example. The baby's brain, skeleton, liver, & everything else are growing at the same time. Sure, some body parts must be grown before others. Humans are multi-tasking chemical machines.

Slashdot Mirror
