New Languages Vs. Old For Parallel Programming
joabj writes "Getting the most from multicore processors is becoming an increasingly difficult task for programmers. DARPA has commissioned a number of new programming languages, notably X10 and Chapel, written especially for developing programs that can be run across multiple processors, though others see them as too much of a departure to ever gain widespread usage among coders."
Parallel programming is not going away, but it is only really worthwhile for certain types of applications. Larger items like operating systems or most system tasks need it. Whether it is worthwhile in lowly application land is a case-by-case decision, and it will mostly depend on the skill of the programmers involved and the budget for the particular application in question.
"Maybe this world is another planet's hell"
Aldous Huxley
A lot of problems are I/O driven -- I would like to see more database client libraries allow a full async approach that lets us not block the threads we are trying to do concurrent work on.
Do your programs ever leak memory? Did you have to work with a team of 100+ SWEs to write the program? Did you have technical specs to satisfy, or was this a weekend project? This is the difference between swimming 100 meters and sailing across the Pacific.
The example in the article is atrocious.
Why would you want the withdrawal and balance check to run concurrently?
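For what it's worth, the usual answer is that you don't: the balance check and the debit have to be a single atomic step, or two threads can both pass the check and overdraw the account. A minimal sketch with a pthread mutex, assuming a hypothetical account struct and function names of my own (nothing here is taken from the article):

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical account type; the names are illustrative only. */
struct account {
    pthread_mutex_t lock;
    long balance;              /* in cents */
};

void account_init(struct account *acct, long opening_balance)
{
    pthread_mutex_init(&acct->lock, NULL);
    acct->balance = opening_balance;
}

/* Check and debit as one step, under the account's lock.
 * Returns true if the money was withdrawn, false otherwise. */
bool account_withdraw(struct account *acct, long amount)
{
    bool ok = false;

    pthread_mutex_lock(&acct->lock);
    if (amount > 0 && acct->balance >= amount) {
        acct->balance -= amount;
        ok = true;
    }
    pthread_mutex_unlock(&acct->lock);
    return ok;
}
```

Any number of threads can call account_withdraw on the same account; the lock serializes them, so the balance never goes negative.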
I don't count threading as parallel processing for the desktop. I don't even hear of any games or applications built for parallel.
Uhhhhhhhhhhh? Yes, well done with that...
How many lines does it take you to parallelize this with pthreads in C?
for (i = 0; i < 1000; i++)
    c[i] = a[i] + b[i];
If it takes you more than 2 lines, then your "language" is too hard to be used everywhere by everyone.
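For reference, here is roughly what the loop above turns into with raw pthreads. This is a minimal sketch, not a tuned implementation: the helper name parallel_add, the slice struct, and the fixed count of four threads are my own assumptions.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4   /* assumed worker count; tune for the machine */

/* One contiguous slice of the arrays, handed to one thread. */
struct slice {
    const int *a;
    const int *b;
    int *c;
    size_t begin;
    size_t end;         /* one past the last index */
};

static void *add_slice(void *arg)
{
    struct slice *s = arg;

    for (size_t i = s->begin; i < s->end; i++)
        s->c[i] = s->a[i] + s->b[i];
    return NULL;
}

/* Hypothetical helper: c[i] = a[i] + b[i] for i in [0, n).
 * Returns 0 on success, -1 if a thread could not be created
 * (in which case part of c may be left unwritten). */
int parallel_add(const int *a, const int *b, int *c, size_t n)
{
    pthread_t tid[NUM_THREADS];
    struct slice s[NUM_THREADS];
    int started = 0;
    int rc = 0;

    for (int t = 0; t < NUM_THREADS; t++) {
        s[t].a = a;
        s[t].b = b;
        s[t].c = c;
        s[t].begin = n * t / NUM_THREADS;
        s[t].end = n * (t + 1) / NUM_THREADS;
        if (pthread_create(&tid[t], NULL, add_slice, &s[t]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    /* Wait for every thread that was actually started. */
    for (int t = 0; t < started; t++)
        pthread_join(tid[t], NULL);
    return rc;
}
```

The call site is one line, parallel_add(a, b, c, 1000), but the helper behind it is about forty. The slices never overlap, so no locking is needed.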
Rehash time...
Parallelism typically falls into two buckets: data parallel and functional parallel. The first challenge for the general programming public is identifying which is which. The second challenge is synchronizing the parallel work in as bug-free a way as possible while retaining the performance advantage of the parallelism.
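To make the two buckets concrete: data parallel means the same operation runs on different pieces of the data, while functional parallel (often called task parallel) means different operations run at the same time. A minimal sketch of the second kind, assuming hypothetical names of my own (job, sum_and_max); the two threads only read the shared array and each writes to its own struct, so no locking is needed:

```c
#include <pthread.h>
#include <stddef.h>

/* Input and output for one task; each thread gets its own copy. */
struct job {
    const int *data;
    size_t n;
    long result;
};

static void *sum_task(void *arg)
{
    struct job *j = arg;
    long total = 0;

    for (size_t i = 0; i < j->n; i++)
        total += j->data[i];
    j->result = total;
    return NULL;
}

static void *max_task(void *arg)
{
    struct job *j = arg;
    long best = j->data[0];    /* caller guarantees n > 0 */

    for (size_t i = 1; i < j->n; i++)
        if (j->data[i] > best)
            best = j->data[i];
    j->result = best;
    return NULL;
}

/* Hypothetical helper: runs two different functions concurrently
 * over the same read-only data. Returns 0 on success, -1 on error. */
int sum_and_max(const int *data, size_t n, long *sum, long *max)
{
    struct job s = { data, n, 0 };
    struct job m = { data, n, 0 };
    pthread_t ts, tm;

    if (n == 0)
        return -1;
    if (pthread_create(&ts, NULL, sum_task, &s) != 0)
        return -1;
    if (pthread_create(&tm, NULL, max_task, &m) != 0) {
        pthread_join(ts, NULL);
        return -1;
    }
    pthread_join(ts, NULL);
    pthread_join(tm, NULL);
    *sum = s.result;
    *max = m.result;
    return 0;
}
```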
Fine-grained parallelism, which is what the functional crowd is promising, will take a *long* time to become mainstream (other interesting examples are things like LLVM and K, but they tend to focus more on data parallelism). Functional is too abstract for most people to deal with (yes, I understand it is easy for *you*).
Short term (i.e. ~5 years), the real benefit will be in threaded/parallel frameworks (my app logic can be serial, tasks that my app needs happen in the background).
Changing industry tool-chains to something entirely new takes many, many years. What most likely will happen is that transactional memory will make it into some level of hardware, enabling faster parallel constructs, and a cool new language will pop up formalizing all of these features. Someone will tear that cool new language apart by removing the rigor and giving it C/C++ style syntax, and then the industry will start using it.
If threading isn't parallelism, then what is? At what level of separation between separate streams of execution does an application become "parallel"?
We all know what to do, but we don't know how to get re-elected once we have done it
Making a threaded application in C isn't difficult. Testing and debugging said application is. Given that threads share memory, rigorously testing buffer overflow conditions becomes doubly important. In addition, adding threading introduces a whole new set of potential errors (such as race conditions, deadlocks, etc.) that need to be tested for.
It's easy enough to create a multi-threaded version of a program when it's for personal use. However, there are a number of issues that arise whenever a threaded program interacts with the (potentially malicious) outside world, and these issues are not trivial to test for or fix. That's why I think that parallel programs are going to be increasingly written in functional programming languages (Common Lisp, Haskell, Scala, etc.). The limitations on side effects that functional languages impose reduce the amount of interaction between threads, and reduce the probability that a failure in a single thread will propagate through the entire application.
That's what libraries are for. This is what you youngins continually forget. Design and properly implement libraries of useful code ONCE and use them many times.
You could easily write a "vector_add" function which spawns threads (or, which is smarter, unlocks pre-spawned ones) to perform a variety of tasks. Then from your application a single line of code would perform an optimized parallel vector addition or whatever.
In fact, a smart DSP lib today would do just that. Pre-spawn a bunch of threads then host a job server which unlocks threads to work on given tasks. That way you have single line functions like vector_add(in1, in2, out, size), etc...
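A rough sketch of that idea with pthreads, so the hidden part is visible. Only the vector_add(in1, in2, out, size) signature comes from the description above; pool_start, pool_stop, the four-worker pool, and the generation counter are my own assumptions, and the sketch assumes only one thread submits jobs at a time.

```c
#include <pthread.h>
#include <stddef.h>

#define POOL_SIZE 4   /* assumed number of pre-spawned workers */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t work_ready = PTHREAD_COND_INITIALIZER;
static pthread_cond_t work_done = PTHREAD_COND_INITIALIZER;
static pthread_t workers[POOL_SIZE];
static int worker_id[POOL_SIZE];
static int started;               /* workers actually running */

/* The current job, protected by `lock`. */
static const float *job_in1, *job_in2;
static float *job_out;
static size_t job_size;
static unsigned long generation;  /* bumped once per submitted job */
static int pending;               /* workers still busy on this job */
static int shutting_down;

static void *worker(void *arg)
{
    int id = *(int *)arg;
    unsigned long seen = 0;       /* last job this worker handled */

    pthread_mutex_lock(&lock);
    for (;;) {
        /* Sleep until a new job is posted or the pool shuts down. */
        while (generation == seen && !shutting_down)
            pthread_cond_wait(&work_ready, &lock);
        if (shutting_down)
            break;
        seen = generation;
        const float *in1 = job_in1, *in2 = job_in2;
        float *out = job_out;
        size_t begin = job_size * id / POOL_SIZE;
        size_t end = job_size * (id + 1) / POOL_SIZE;
        pthread_mutex_unlock(&lock);

        /* Work on this worker's slice without holding the lock. */
        for (size_t i = begin; i < end; i++)
            out[i] = in1[i] + in2[i];

        pthread_mutex_lock(&lock);
        if (--pending == 0)
            pthread_cond_signal(&work_done);
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Wake all workers, tell them to exit, and wait for them. */
void pool_stop(void)
{
    pthread_mutex_lock(&lock);
    shutting_down = 1;
    pthread_cond_broadcast(&work_ready);
    pthread_mutex_unlock(&lock);
    for (int i = 0; i < started; i++)
        pthread_join(workers[i], NULL);
    started = 0;
}

/* Pre-spawn the workers. Returns 0 on success, -1 on failure. */
int pool_start(void)
{
    generation = 0;
    pending = 0;
    shutting_down = 0;
    for (started = 0; started < POOL_SIZE; started++) {
        worker_id[started] = started;
        if (pthread_create(&workers[started], NULL, worker,
                           &worker_id[started]) != 0) {
            pool_stop();
            return -1;
        }
    }
    return 0;
}

/* Post one job and block until every worker has finished its slice. */
void vector_add(const float *in1, const float *in2, float *out, size_t size)
{
    pthread_mutex_lock(&lock);
    job_in1 = in1;
    job_in2 = in2;
    job_out = out;
    job_size = size;
    pending = POOL_SIZE;
    generation++;
    pthread_cond_broadcast(&work_ready);
    while (pending > 0)
        pthread_cond_wait(&work_done, &lock);
    pthread_mutex_unlock(&lock);
}
```

The application calls pool_start() once, then vector_add(in1, in2, out, size) as often as it likes, then pool_stop() at exit. The threads are created once and reused for every job.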
So you could actually write really easy to read parallel programs in C. You just have to know the first thing about software development.
In short, your rant is a product of not knowing what you are doing.
Erlang is quite OK for non-distributed programming. Its model of threads exchanging messages is just a natural fit for it. As it is for multicore systems.
AC here is right, but also missing something. I often hear about things being "easy". Often when people say something is "easy" they really mean, "it's easy after it's done." This is one of those things where it's only easy after it's done. The code might look easy after it's created and debugged. But getting to the point where it's created and validated and debugged is much harder in some languages and approaches (e.g. C) than it is in others (e.g. Erlang).
Take someone experienced with multithread, multi process, multi node programming in C. And put them up against someone experienced with same in a language designed for distributed systems. Have them drag race on producing code that expands to an arbitrary number of processes and computers and evenly distributes the load among them in a fault tolerant, smooth way. The person in a language designed for it is going to blow the doors off the person doing it in C in terms of productivity. And that's what these languages are all about.
If you've got a library already present that does exactly what you need, great, and AC is right on target there. But when you don't have such a library, and you almost never will, then it's great to use a tool that makes the job easy to do well and do quickly.
I have not read the article (par for the course here) but I think there is probably some confusion among the commenters regarding the difference between multi-threading programs and parallel algorithms. Database servers, asynchronous I/O, background tasks and web servers are all examples of multi-threaded applications, where each thread can run independently of every other thread with locks protecting access to shared objects. This is different from (and probably simpler than) parallel programs. Map-reduce is a great example of a parallel distributed algorithm, but it is only one parallel computing model: Multiple Instruction / Multiple Data (MIMD). Single Instruction / Multiple Data (SIMD) algorithms implemented on super-computers like Cray (more of a vector machine, but it's close enough to SIMD) and MasPar systems require different and far more complex algorithms. In addition, purpose-built supercomputers may have additional restrictions on their memory accesses, such as whether multiple CPUs can concurrently read or write from memory.
Of course, the Cray and MasPar systems are purpose-built machines, and, much as special-purpose processors have fallen behind general-purpose CPUs in performance, Cray and MasPar systems have fallen into disuse and virtual obscurity; therefore, one might argue that SIMD-type systems and their associated algorithms should be discounted. But there is a large class of problems, particularly sorting algorithms, well suited to SIMD algorithms, so perhaps we shouldn't be so quick to dismiss them.
There is a book called An Introduction to Parallel Algorithms by Joseph JaJa (http://www.amazon.com/Introduction-Parallel-Algorithms-Joseph-JaJa/dp/0201548569) that shows some of the complexities of developing truly parallel algorithms.
(Disclaimer: I own a copy of that book but otherwise have no financial interests in it.)
Simple answer: synchronized shared data access.
Technically creating the threads is the easy bit. Making sure that the threads don't fight over shared mutable data is so hard that most people give up. Frinstance, "serializable" isolation is the only really guaranteed isolation level for database transactions under all circumstances, but it can turn your DB into a uniprocessor; most people avoid it. Java's Swing UI gives up on data locking, and just says "only my thread touches the UI" and provides an API to book tasks on it.
There exist lots of cases where threaded parallelism is easy to implement. But many cases are hard, with subtle pathological difficulties. Once you bundle a few of these cases into the same system, correctness becomes impossible to estimate. Add to that the fact that many of the bugs are intermittent, debugger-resistant (schroedingbugs!) and potentially fatal (data corruption, when you encourage liveness; deadlocks, when you vote for safety), and you have some seriously difficult problems.
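A small illustration of the safety side of that trade-off. Moving money between two locked accounts deadlocks if one thread locks A then B while another locks B then A. One common way out is a fixed global lock order. A minimal sketch, assuming a hypothetical acct struct and ordering by address (both my own choices):

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical account type; the names are illustrative only. */
struct acct {
    pthread_mutex_t lock;
    long balance;
};

void acct_init(struct acct *a, long opening_balance)
{
    pthread_mutex_init(&a->lock, NULL);
    a->balance = opening_balance;
}

/* Move `amount` from one account to another.
 * Both locks are always taken in address order, so two opposite
 * transfers running at once cannot each hold the lock the other needs.
 * Returns true if the transfer happened. */
bool acct_transfer(struct acct *from, struct acct *to, long amount)
{
    bool ok = false;

    if (from == to || amount <= 0)
        return false;

    struct acct *first = (uintptr_t)from < (uintptr_t)to ? from : to;
    struct acct *second = (first == from) ? to : from;

    pthread_mutex_lock(&first->lock);
    pthread_mutex_lock(&second->lock);
    if (from->balance >= amount) {
        from->balance -= amount;
        to->balance += amount;
        ok = true;
    }
    pthread_mutex_unlock(&second->lock);
    pthread_mutex_unlock(&first->lock);
    return ok;
}
```

This only works if every piece of code that takes two account locks follows the same order, which is exactly the kind of system-wide rule that gets hard to check once a few such cases are bundled together.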
Check out Clojure [clojure.org]. The only programming language around that really addresses the issue of programming in a multi-core environment.
That's a rather bold statement. You do realize that those neat features of Clojure like STM or actors weren't originally invented for it? In fact, you could do most (all?) of that in Haskell before Clojure even appeared.
On a side note, while STM sounds great in theory for care-free concurrent programming, the performance penalty that comes with it in existing implementations is hefty. It's definitely a promising area, but it needs more research before the results are consistently usable in production.
No widely spoken human natural language was "invented," Modern English included. Where do people come up with these things? Modern English evolved out of Middle English, just as Spanish evolved out of Latin, etc. Modern English was not "invented" in any meaningful sense of "invented."
reference for WikiWeenies.
Sorry but, IMO, the Cell is a perfect example of how not to design a multicore processor. Heterogeneous processors bring nothing new to the table of solutions that was not already there. We had systems with CPUs and GPUs before the Cell (or Intel's Larrabee and AMD's Fusion) showed up. Everybody knows that they're a pain in the ass to program. Neither CUDA nor OpenCL nor Intel's much ballyhooed TBB (Threading Building Blocks) will change that fact.
My point is that one does not design a parallel processor and then come up with a programming model to exploit it. It should be the other way around. The programming model should come first. One should design a model that makes parallel programming easy and the resulting apps rock-solid. Only then, after you have perfected your model, should you even consider designing a processor to support the model.
IOW, everybody's doing it wrong, and by everybody, I mean all the big players in the multicore hardware/software industry: Intel, Microsoft, IBM, Sun-Oracle, AMD, ARM, Apple, Freescale, etc. The major computer science centers that are getting a lot of research money from the industry are not helping either, since they have to kowtow to the likes of Intel and AMD, whose main interest is to safeguard their installed base and preserve continuity.
It makes no difference. When the pain becomes unbearable (it's all about money), it will suddenly dawn on everybody that what is needed is to break away from the past.
using threads, or even multiple processes, and nobody has found a good model that actually makes it easy to do parallel programming.
The reason is that threads and processes are the wrong tool for the task: they introduce a lot of additional complexity while still failing to give you fine-grained parallelization. They are patchworks that try to add the ability to do parallelization to languages that were built for sequential evaluation, instead of solving the problem from the ground up.
Functional languages look like a much better solution for parallel programming. Without side effects, a very large part of parallel programming problems disappears instantly. There will be a need for some relearning, as functional languages require a different approach to tackling a programming problem, but in the end that is what is needed. Look at GPU programming: the reason it is trivial to apply parallelization there is that the infrastructure forces you to write your program in a way that can be parallelized. You can't make your fragment shader go crazy and draw over other pixels and stuff. You have one pixel that you have to fill and no side effects outside of that, and as a result you can throw as much parallel processing power at it as you like.