New Languages Vs. Old For Parallel Programming
joabj writes "Getting the most from multicore processors is becoming an increasingly difficult task for programmers. DARPA has commissioned a number of new programming languages, notably X10 and Chapel, written especially for developing programs that can be run across multiple processors, though others see them as too much of a departure to ever gain widespread usage among coders."
Parallel is not going to go anywhere but is only really valid for certain types of applications. Larger items like operating systems or most system tasks need it. Whether it is worthwhile in lowly application land is a case-by-case decision, but it will mostly depend on the skill of the programmers involved and the budget for the particular application in question.
"Maybe this world is another planet's hell"
Aldous Huxley
A lot of problems are I/O driven -- I would like to see more database client libraries allow a full async approach that lets us not block the threads we are trying to do concurrent work on.
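A minimal sketch of what a fully async client could look like from the caller's side, using Python's asyncio. The `fake_query` coroutine is a hypothetical stand-in for a real database call (no actual client library is used here); `asyncio.sleep` just simulates waiting on the network without blocking the thread.

```python
import asyncio

async def fake_query(name, delay):
    # Hypothetical stand-in for an async database call. While this
    # coroutine waits, the event loop is free to run the others.
    await asyncio.sleep(delay)
    return f"{name}: done"

async def run_queries():
    # gather() runs the coroutines concurrently on a single thread and
    # returns the results in the order the awaitables were passed in.
    return await asyncio.gather(
        fake_query("users", 0.05),
        fake_query("orders", 0.05),
        fake_query("invoices", 0.05),
    )

if __name__ == "__main__":
    print(asyncio.run(run_queries()))
```

The three simulated queries overlap their waiting time instead of running back to back, which is the whole point of not blocking the worker thread.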
Erlang is an older established language designed for parallel processing.
Erlang was first developed in 1986, making it about a decade older than Java or Ruby. It is younger than C, about the same age as Perl, and just a tad older than Python. It is a mature language with a large support community, especially in industrial applications. It is time tested and proven.
It is also open source and offers many options for commercial support.
Before anyone at DARPA thinks that they can design a better language for concurrent parallel programming, I think they should be forced to spend one year learning Ada and a second year working in Ada. If they survive, they will most likely be cured of the thought that the Defense Department can design good programming languages.
The example in the article is atrocious.
Why would you want the withdrawal and balance check to run concurrently?
Bullshit.
Tell that to Apache, and Oracle, and basically anything that runs in a server room.
Threading I don't count as parallel processing for the desktop. I don't even hear of any games or applications built for parallel processing.
Uhhhhhhhhhhh? Yes, well done with that...
It's not creating threads that's hard. What's hard is getting them to communicate with each other without ever getting into a situation where thread A is waiting for thread B and thread B is waiting for thread A.
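One common way to rule out that "A waits for B, B waits for A" situation is to always take locks in the same global order. A rough sketch in Python follows; the `Account` class, `transfer` and `run_demo` are made up for illustration, and it assumes the two accounts are always distinct objects with distinct `ident` values.

```python
import threading

class Account:
    # Hypothetical shared object, each one guarded by its own lock.
    def __init__(self, ident, balance):
        self.ident = ident
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    # Acquire both locks in a fixed order (lowest ident first), no matter
    # which direction the money moves. Two threads transferring in
    # opposite directions then contend for the same first lock instead
    # of each holding one lock and waiting for the other.
    first, second = sorted((src, dst), key=lambda acct: acct.ident)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount

def run_demo(rounds=1000):
    a = Account(1, 1000)
    b = Account(2, 1000)

    def a_to_b():
        for _ in range(rounds):
            transfer(a, b, 1)

    def b_to_a():
        for _ in range(rounds):
            transfer(b, a, 1)

    threads = [threading.Thread(target=a_to_b),
               threading.Thread(target=b_to_a)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.balance, b.balance
```

If `transfer` instead locked `src` first and `dst` second, the two threads in `run_demo` could each grab one lock and wait forever on the other.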
Check out Clojure. The only programming language around that really addresses the issue of programming in a multi-core environment. It's also quite a sweet language besides that.
Rehash time...
Parallelism typically falls into two buckets: data parallel and functional parallel. The first challenge for the general programming public is identifying what is what. The second challenge is synchronizing parallelism in as bug-free a way as possible while retaining the performance advantage of the parallelism.
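To make the two buckets concrete, here is a small Python sketch using the standard `concurrent.futures` module. The function names are invented for the example. It only shows the shape of each style: in CPython, threads will not speed up CPU-bound work like this, so treat it as an illustration of structure, not performance.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

def data_parallel(values):
    # Data parallel: the same operation is applied to every element,
    # and the elements are spread across the workers.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(square, values))

def task_parallel(values):
    # Functional (task) parallel: different operations run side by side,
    # here two unrelated computations over the same input.
    with ThreadPoolExecutor(max_workers=2) as pool:
        total = pool.submit(sum, values)
        largest = pool.submit(max, values)
        return total.result(), largest.result()
```

Neither function needs explicit synchronization because no worker writes to shared state; the results are only combined after each worker has finished.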
Doing fine-grained parallelism, which is what the functional crowd is promising, is something that will take a *long* time to become mainstream (other interesting examples are things like LLVM and K, but they tend to focus more on data parallel). Functional is too abstract for most people to deal with (yes, I understand it is easy for *you*).
Short term (i.e. ~5 years), the real benefit will be in threaded/parallel frameworks (my app logic can be serial, tasks that my app needs happen in the background).
Changing industry tool-chains to something entirely new takes many, many years. What most likely will happen is that transactional memory will make it into some level of hardware, enabling faster parallel constructs, and a cool new language will pop up formalizing all of these features. Someone will tear that cool new language apart by removing the rigor and giving it C/C++ style syntax, and then the industry will start using it.
This is a subject near and dear to my heart. I got to participate in one of the early X10 alpha tests (my research group was asked to try it out and give feedback to Vivek Sarkar's IBM team). Since then, I've worked with lots of other specialized HPC programming languages.
One extremely important aspect of supercomputing, a point that many people fail to grasp, is that application code tends to live a long, long, long time. Far longer than the machines themselves. Rewriting code is simply too expensive and economically inefficient. At Los Alamos National Lab, much of the source code they run is nuclear simulations written in Fortran 77 or Fortran 90. Someone might have updated it to use MPI, but otherwise it's the same program. So it's important to bear in mind that those older languages, while not nearly as well suited for parallelism (either for programmer ease-of-use/efficiency, or to allow the compiler to do deep analysis/optimization/scheduling), are going to be around for a long time yet.
To make laws that man cannot, and will not obey, serves to bring all law into contempt.
--E.C. Stanton
One. Newline characters are for wimps.
The fact that it seems so simple at first is where the problem starts. You had no trouble in your program. One program. That's a great start. Now do something non-trivial. Say, make something that simulates digital circuits: AND gates, OR gates, NOT gates. Let them be wired up together. Accept an arbitrarily complex setup of digital logic gates. Have it simulate the outputs propagating to the inputs. And make it so that it expands across an arbitrary number of threads, and make it expand across an arbitrary number of processes, both on the same computer and on other computers on the same network.
There are some languages and approaches you could choose for such a project that will help you avoid the kinds of pitfalls that await you, and provide most or all of the infrastructure that you'd have to write yourself in other languages.
If you're interested in learning more about parallel programming, why it's hard, and what can go wrong, and how to make it easy, I suggest you read a book about Erlang. Then read a book about Scala.
The thing is, it looks easy at first, and it really is easy at first. Then you launch your application into production, and stuff goes real funny and it's nigh unto impossible to troubleshoot what's wrong. In the lab, it's always easy. With multithreaded/multiprocess/multi-node systems, you've got to work very very hard to make them mess up in the lab the same way they will in the real world. So it seems like not a big deal at first until you launch the stuff and have to support it running every day in crazy unpredictable conditions.
You know what, I don't think I'm going to use modern English, either.
Don't you know that early modern English was invented to have something standard into which the Bible could be translated? For shame!
As a devoted secularist, I'll just burn all my Shakespeare and Rushdie after I delete all my Perl code.
The good and new comes from no quarter where it is looked for, and is always something different from what is expected.
I've been very disappointed in parallel programming support. The C/C++ community has a major blind spot in this area - they think parallelism is an operating system feature, not a language issue. As a result, C and C++ provide no assistance in keeping track of what locks what. Hence race conditions. In Java, the problem was at least thought about, but "synchronized" didn't work out as well as expected. Microsoft Research people have done some good work in this area, and some of it made it into C#, but they have too much legacy to deal with.
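For anyone who hasn't been bitten yet, this is roughly what "keeping track of what locks what" looks like when the language leaves it to you. A minimal Python sketch (the `Counter` class and `count_with_threads` helper are invented for the example): nothing but convention ties `_lock` to `value`, and code that touches `value` without taking the lock would still run, which is exactly the problem.

```python
import threading

class Counter:
    def __init__(self):
        self.value = 0
        # By convention only, this lock guards self.value. Nothing in
        # the language enforces that association.
        self._lock = threading.Lock()

    def increment(self):
        # "value += 1" is a read-modify-write. Without the lock, two
        # threads can read the same old value and one update is lost.
        with self._lock:
            self.value += 1

def count_with_threads(n_threads=4, per_thread=10000):
    counter = Counter()

    def work():
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place the final count is always `n_threads * per_thread`; remove the `with self._lock:` line and the result can come up short, depending on the interpreter and timing.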
At the OS level, in most operating systems, the message passing primitives suck. The usual approach in the UNIX/Linux world is to put marshalling on top of byte streams on top of sockets. Stuff like XML and CORBA, with huge overhead. The situation sucks so bad that people think JSON is a step forward.
What you usually want is a subroutine call; what the OS usually gives you is an I/O operation. There are better and faster message passing primitives (see MsgSend/MsgReceive in QNX), but they've never achieved any traction in the UNIX/Linux world. Nobody uses System V IPC, a mediocre idea from the 1980s. For that matter, there are still applications being written using lock files.
Erlang is one of the few parallel languages actually used to implement large industrial applications.
I have not read the article (par for the course here) but I think there is probably some confusion among the commenters regarding the difference between multi-threaded programs and parallel algorithms. Database servers, asynchronous I/O, background tasks and web servers are all examples of multi-threaded applications, where each thread can run independently of every other thread with locks protecting access to shared objects. This is different from (and probably simpler than) parallel programs. Map-reduce is a great example of a parallel distributed algorithm, but it is only one parallel computing model: Multiple Instruction / Multiple Data (MIMD). Single Instruction / Multiple Data (SIMD) algorithms implemented on supercomputers like Cray (more of a vector machine, but it's close enough to SIMD) and MasPar systems require different and far more complex algorithms. In addition, purpose-built supercomputers may have additional restrictions on their memory accesses, such as whether multiple CPUs can concurrently read from or write to memory.
Of course, the Cray and MasPar systems are purpose-built machines, and, much like special-purpose processors have fallen behind general-purpose CPUs in performance, Cray and MasPar systems have fallen into disuse and virtual obscurity; therefore, one might argue that SIMD-type systems and their associated algorithms should be discounted. But there is a large class of problems -- particularly sorting algorithms -- well suited to SIMD algorithms, so perhaps we shouldn't be so quick to dismiss them.
There is a book called An Introduction to Parallel Algorithms by Joseph JaJa (http://www.amazon.com/Introduction-Parallel-Algorithms-Joseph-JaJa/dp/0201548569) that shows some of the complexities of developing truly parallel algorithms.
(Disclaimer: I own a copy of that book but otherwise have no financial interests in it.)
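For readers who haven't seen the map-reduce model mentioned above, here is a toy single-machine sketch in Python. It is not a distributed implementation and the function names are made up; a thread pool stands in for the worker nodes. The map step counts words in each chunk independently, and the reduce step merges the partial counts.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def map_count(chunk):
    # Map step: each chunk is processed on its own, with no shared state.
    return Counter(chunk.split())

def merge(left, right):
    # Reduce step: combine two partial word counts into one.
    return left + right

def word_count(chunks):
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_count, chunks))
    # Fold the partial results together, starting from an empty count.
    return reduce(merge, partials, Counter())
```

Because the map step never touches shared data, the chunks could just as well live on different machines; only the merge needs to see more than one result at a time.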
No widely spoken human natural language was "invented," Modern English included. Where do people come up with these things? Modern English evolved out of Middle English, just as Spanish evolved out of Latin, etc. Modern English was not "invented" in any meaningful sense of "invented."
reference for WikiWeenies.
Sorry but, IMO, the Cell is a perfect example of how not to design a multicore processor. Heterogeneous processors bring nothing new to the table that was not already there. We had systems with CPUs and GPUs before the Cell (or Intel's Larrabee and AMD's Fusion) showed up. Everybody knows that they're a pain in the ass to program. Neither CUDA nor OpenCL nor Intel's much ballyhooed TBB (Threading Building Blocks) will change that fact.
My point is that one does not design a parallel processor and then come up with a programming model to exploit it. It should be the other way around. The programming model should come first. One should design a model that makes parallel programming easy and the resulting apps rock-solid. Only then, after you have perfected your model, should you even consider designing a processor to support the model.
IOW, everybody's doing it wrong, and by everybody, I mean all the big players in the multicore hardware/software industry: Intel, Microsoft, IBM, Sun-Oracle, AMD, ARM, Apple, Freescale, etc. The major computer science centers that are getting a lot of research money from the industry are not helping either, since they have to kowtow to the likes of Intel and AMD, whose main interest is to safeguard their installed base and preserve continuity.
It makes no difference. When the pain becomes unbearable (it's all about money), it will suddenly dawn on everybody that what is needed is to break away from the past.
Whoever told you that is mistaken.
The easiest way to take advantage of a multiprocessing environment is to use techniques that will be familiar to any high level programmer. For example, you don't write for loops, you call functions written in a low level language to do things like that for you. Those low level functions can be easily parallelized, giving all your code a boost.
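A small Python sketch of that idea, with invented function names: the calling code never writes the loop, it hands a function to a map, so swapping the serial `map` for a pool's `map` changes nothing else. A thread pool is used here only to keep the example self-contained; for CPU-bound work in CPython you would likely want a process pool or a library whose inner loops run in native code.

```python
from concurrent.futures import ThreadPoolExecutor

def normalize(word):
    # The per-item work. It touches no shared state, which is what
    # makes it safe to farm out.
    return word.strip().lower()

def normalize_all(words, map_fn=map):
    # No explicit for loop: whoever supplies map_fn decides whether the
    # items are processed serially or in parallel.
    return list(map_fn(normalize, words))

def normalize_all_parallel(words, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map has the same call shape as the built-in map and
        # yields results in input order.
        return normalize_all(words, map_fn=pool.map)
```

Both versions return the same list, and the application logic in `normalize_all` stays the same either way.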