Slashdot Mirror


New Languages Vs. Old For Parallel Programming

joabj writes "Getting the most from multicore processors is becoming an increasingly difficult task for programmers. DARPA has commissioned a number of new programming languages, notably X10 and Chapel, written especially for developing programs that can be run across multiple processors, though others see them as too much of a departure to ever gain widespread usage among coders."

26 of 321 comments (clear)

  1. Parallel is here to stay but not for every app by Meshach · · Score: 3, Insightful

    Parallel is not going to go anywhere but is only really valid for certain types if applications. Larger items like operating systems or most system tasks need it. Whether it is worthwhile in lowly application land is a case by case decision; but will mostly depend on the skill of programmers involved and the budget for the particular application in question.

    --
    "Maybe this world is another planet's hell"
    Aldous Huxley
    1. Re:Parallel is here to stay but not for every app by Nursie · · Score: 4, Informative

      How blinkered are you?

      There exist whole classes of software that have been doing parallel execution, be it through threads, processes or messaging, for decades.

      Look at any/all server software, for god's sake, look at apache, or any database, or any transaction engine.

      If you're talking about desktop apps then make it clear. The thing with most of those is that the machines far exceed their requirements with a single core, most of the time. But stuff like video encoding has been threaded for a while too.

    2. Re:Parallel is here to stay but not for every app by Daniel+Dvorkin · · Score: 4, Interesting

      True enough, but the class of applications for which parallel processing is useful is growing rapidly as programmers learn to think in those terms. Any program with a "for" or "while" loop in which the results of one iteration do not depend on the results of the previous iteration, as well as a fair number of such loops in which the results do have such a dependency, is a candidate for parallelization -- and that means most of the programs which most programmers will ever write. We just need the languages not to make coding this way too painful.

      --
      The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
    3. Re:Parallel is here to stay but not for every app by grumbel · · Score: 4, Insightful

      Most computing intensive problems that a user will encounter at home are easily parallelizable, i.e. video encoding, gaming, photoshop filters, webbrowsing and so on. The amount of times where I maxed out a single CPU and the given problem would not have been to some large degree parallelizable are close to zero.

      The trouble is that they are only "easy" parallelizable in concept, implementing parallelization on an exciting serial codebase is where it gets messy.

    4. Re:Parallel is here to stay but not for every app by AuMatar · · Score: 4, Insightful

      And how many of those cores are above 2% utilization for 90% of the day? Parallelization on the desktop is a solution is search of a problem- we have in a single core dozens of times what the average user needs. My email, web browsing, word processor, etc aren't cpu limited. They're network limited, and after that they're user limited (a human can only read so many slashdot stories a minute). There's no point in anything other than servers having 4 or 8 cores. But if Intel doesn't fool people into thinking they need new computers their revenue will go down, so 16 core desktops next year it is.

      --
      I still have more fans than freaks. WTF is wrong with you people?
    5. Re:Parallel is here to stay but not for every app by bertok · · Score: 3, Interesting

      The % utilization metric is a red herring. Most servers are underutilized by that metric, which is why VMware is making so much money consolidating them!

      Users don't actually notice, or care, about CPU utilization. What users notice, is latency. If my computer is 99% idle, that's fine, but I want it to respond to mouse clicks in a timely fashion. I don't want to wait, even if it's just a few hundred milliseconds. This is where parallel computation can bring big wins.

      One thing I noticed is that MS SQL Server still has its default "threshold for query parallelism" set to "5", which AFAIK means that if the query planner estimates that a query will take more than 5 seconds, it'll attempt a parallel query plan instead. That's insane! I don't know what kind of users Microsoft is thinking of, but in my world, if a form takes 5 seconds to display, it's way too slow to be considered acceptable. Many servers now have 8 or more cores, and 24 (4x hexacore) is going to be common for database servers very soon. In that picture, even if you only consider a 15x speedup due to overhead, 5 seconds becomes something like 300 milliseconds!

      Ordinary Windows applications can benefit from the same kind of speedup. For example, a huge number of applications use compression internally (all Java JAR files, of the docx-style Office 2007 files, etc...), yet the only parallel compressor I know of is WinRAR, which really does get 4x the speed on my quad-core. Did you know that the average compression rate for a normal algorithm like zip is something like 10MB/sec/core? That's pathetic. A Core i7 with 8 threads could probably do the same thing at 60 MB/sec or more, which is more in line with, say, gigabit ethernet speeds, or a typical hard-drive.

      In other words, for a large class of apps, your hard-drive is not the bottleneck, your CPU is. How pathetic is that? A modern CPU has 4 or more cores, and it's busy hammering just one of those while your hard-drive, a mechanical component, is waiting to send it more data.

      You wait until you get an SSD. Suddenly, a whole range of apps become "cpu limited".

    6. Re:Parallel is here to stay but not for every app by grumbel · · Score: 3, Insightful

      Web-browsing for instance, is not inherently paralizable.

      Of course it is. Each Tab can be rendered in parallel, each JPEG can be decoded in parallel, each character can be antialiased in parallel, each download can happen in parallel, all the pixel that make up the final image can be alpha blended in parallel.

      Yes, at some point you hit a wall where you will be left with a piece of code that you have to evaluate sequentially, you can't alpha blend half a pixel and your layouting engine might also require some sequential evalution. My point however is that none of those are CPU intensive enough to be a problem for normal use.

      These days I don't have to wait for my computer because it has to solve some tiny non-parallelizable task, but because my computer is doing some heavy data crunching that could be done in parallel without much of a problem.

      Or to put it another way: Half the for-loops out there just process each element in a list, the major reason why the don't do it in parallel is because the programming language makes it extremely hard to do so, not because there is some fundamental reason why it can't be done.

  2. We need async frameworks too! by PhrostyMcByte · · Score: 4, Insightful

    A lot of problems are I/O driven -- I would like to see more database client libraries allow a full async approach that lets us not block the threads we are trying to do concurrent work on.

  3. Old languages designed for parallel processing? by number6x · · Score: 4, Informative

    Erlang is an older established language designed for parallel processing.

    Erlang was first developed in 1986, making it about a decade older than Java or Ruby. It is younger than Perl or C, and just a tad older than Python. It is a mature language with a large support community, especially in industrial applications. It is time tested and proven.

    It is also Open source and offers many options for commercial support.

    Before anyone at DARPA thinks that they can design a better language for concurrent parallel programming then I think they should be forced to spend 1 year learning Ada, and a second year working in Ada. If they survive they will most likely be cured of the thought that the Defense department can design good programming languages

    1. Re:Old languages designed for parallel processing? by coppro · · Score: 4, Informative

      Erlang is probably the best language for servers and similar applications available. Not only is in inherently parallel (though they've only recently actually made the engine multithreaded, as the paralellism is in the software), but it is very easily networked as well. As a result, a well-written Erlang program can only be taken down by simultaneously killing an entire cluster of computers.

      What's more, it has a little-seen feature of being able to handle code upgrades to most any component of the program without ever stopping - it keeps two versions of each module (old and new) in memory, and code can be written to automatically ensure a smooth transition into the new code when the upgrade occurs.

      If I recall correctly, the Swedish telecom where Erlang was designed had one server running it with 7 continuous years uptime.

  4. Awful example in the article by theNote · · Score: 3, Insightful

    The example in the article is atrocious.

    Why would you want the withdrawal and balance check to run concurrently?

    1. Re:Awful example in the article by awol · · Score: 3, Interesting

      The example in the article is atrocious.

      Why would you want the withdrawal and balance check to run concurrently?

      Because I can do a whole lot of "local" withdrawal processing whilst my balance check is off checking the canonical source of balance information. If it's comes back OK then the work I have been doing in parallel is now commitable work and my transaction is done. Perhaps in no more time than either of the balance check or the withdrawal whichever is the longest. Whilst the balance check/withdrawal example may seem ridiculous. There are some very interesting applications of this kind of problem in securities (financial) trading systems where the canonical balances of different instruments would conveniently (and some times mandatorily) stored in different locations and some complex synthetic transactions require access to balances from more than one instrument in order to execute properly.

      It seems to me that most of the interesting parallism problems relate to distributed systems and it is not just a question of N phase commit databases but rather a construct of "end to end" dependencies in your processing chain where the true source of data cannot be accessed from all the nodes in the cluster at the same time from a procedural perspective.

      It is this fact that to me suggests that the answer to these issues is a radical change in language toward the functional or logical types of languages like haskel and prolog with erlang being a very interesting place on that path for right now.

      --
      "The first thing to do when you find yourself in a hole is stop digging."
  5. Re:Parallel programming is dead. No one uses it... by Nursie · · Score: 3, Informative

    Bullshit.

    Tell that to apache, and oracle, and basically anything that runs in a server room.

  6. Re:Parallel programming is dead. No one uses it... by beelsebob · · Score: 3, Insightful

    Threading i don't count as parallel processing for the desktop. I don't even hear of any games or applications built for parallel.

    Uhhhhhhhhhhh? Yes, well done with that...

  7. Re:What's so hard? by beelsebob · · Score: 4, Informative

    It's not creating threads that's hard - it's getting them to communicate with each other, without ever getting into a situation where thread a is waiting for thread b and thread b is waiting for thread a that's hard.

  8. Clojure by slasho81 · · Score: 4, Interesting

    Check out Clojure. The only programming language around that really addresses the issue of programming in a multi-core environment. It's also quite a sweet language besides that.

  9. A view based on history... by Anonymous Coward · · Score: 3, Insightful

    Rehash time...

    Parallelism typically falls into two buckets: Data parallel and functional parallel. The first challenge for the general programming public is identifying what is what. The second challenge is synchronizing parallelism in as bug free way as possible while retaining the performance advantage of the parallelism.

    Doing fine-grained parallelism - what the functional crowd is promising, is something that will take a *long* time to become mainstream (Other interesting examples are things like LLVM and K, but they tend to focus more on data parallel). Functional is too abstract for most people to deal with (yes, I understand it is easy for *you*).

    Short term (i.e. ~5 years), the real benefit will be in threaded/parallel frameworks (my app logic can be serial, tasks that my app needs happen in the background).

    Changing industry tool-chains to something entirely new takes many many years. What most likely will happen is transactional memory will make it into some level of hardware, enabling faster parallel constructs, a cool new language will pop up formalizing all of these features. Someone will tear that cool new language apart by removing the rigor and giving it C/C++ style syntax, then the industry will start using it

  10. Established vs new programming languages for HPC by Raul654 · · Score: 3, Informative

    This is a subject near and dear to my heart. I got to participate in one of the early X10 alpha tests (my research group was asked to try it out and give feedback to Vivek Sarker's IBM team). Since then, I've worked with lots of other specialized programming HPC programming languages.

    One extremely important aspect of supercomputing, a point that many people fail to grasp, is that application code tends to live a long, long, long time. Far longer than the machines themselves. Rewriting code is simply too expensive and economically inefficient. At Los Alamos National Lab, much of the source code they run are nuclear simulations written Fortran 77 or Fortran 90. Someone might have updated it to use MPI, but otherwise it's the same program. So it's important to bear in mind that those older languages, while not nearly as well suited for parallelism (either for programmer ease-of-use/effeciency, or to allow the compiler to do deep analysis/optimization/scheduling), are going to be around for a long time yet.

    --


    To make laws that man cannot, and will not obey, serves to bring all law into contempt.
    --E.C. Stanton
  11. Re:What's so hard? by Anonymous Coward · · Score: 4, Funny

    How many lines does it take you to parallelize this with pthreads in C?

    One. Newline characters are for wimps.

  12. Re:What's so hard? by Unoti · · Score: 3, Interesting

    The fact that it seems so simple at first is where the problem starts. You had no trouble in your program. One program. That's a great start. Now do something non-trivial. Say, make something that simulates digital circuits-- and gates, or gates, not gates. Let them be wired up together. Accept an arbitrarily complex setup of digital logic gates. Have it simulate the outputs propagating to the inputs. And make it so that it expands across an arbitrary number of threads, and make it expand across an arbitrary number of processes, both on the same computer and on other computers on the same network.

    There are some languages and approaches you could choose for such a project that will help you avoid the kinds of pitfalls that await you, and provide most or all of the infrastructure that you'd have to write yourself in other languages.

    If you're interested in learning more about parallel programming, why it's hard, and what can go wrong, and how to make it easy, I suggest you read a book about Erlang. Then read a book about Scala.

    The thing is, it looks easy at first, and it really is easy at first. Then you launch your application into production, and stuff goes real funny and it's nigh unto impossible to troubleshoot what's wrong. In the lab, it's always easy. With multithreaded/multiprocess/multi-node systems, you've got to work very very hard to make them mess up in the lab the same way they will in the real world. So it seems like not a big deal at first until you launch the stuff and have to support it running every day in crazy unpredictable conditions.

  13. Re:Chapel? by sam_handelman · · Score: 3, Funny

    You know what, I don't think I'm going to use modern English, either.

      Don't you know that early modern English was invented to have something standard into which the bible could be translated? For shame!

      As a devoted secularist, I'll just burn all my shakespeare and rushdie after I delete all my perl code.

    --
    The good and new comes from no quarter where it is looked for, and is always something different from what is expected.
  14. The mess by Animats · · Score: 4, Interesting

    I've been very disappointed in parallel programming support. The C/C++ community has a major blind spot in this area - they think parallelism is an operating system feature, not a language issue. As a result, C and C++ provide no assistance in keeping track of what locks what. Hence race conditions. In Java, the problem was at least thought about, but "synchronized" didn't work out as well as expected. Microsoft Research people have done some good work in this area, and some of it made it into C#, but they have too much legacy to deal with.

    At the OS level, in most operating systems, the message passing primitives suck. The usual approach in the UNIX/Linux world is to put marshalling on top of byte streams on top of sockets. Stuff like XML and CORBA, with huge overhead. The situation sucks so bad that people think JSON is a step forward.

    What you usually want is a subroutine call; what the OS usually gives you is an I/O operation. There are better and faster message passing primitives (see MsgSend/MsgReceive in QNX), but they've never achieved any traction in the UNIX/Linux world. Nobody uses System V IPC, a mediocre idea from the 1980s. For that matter, there are still applications being written using lock files.

    Erlang is one of the few parallel languages actually used to implement large industrial applications.

  15. Multi-threaded or Parallel? by ipoverscsi · · Score: 5, Insightful

    I have not read the article (par for the course here) but I think there is probably some confusion among the commenters regarding the difference between multi-threading programs and parallel algorithms. Database servers, asynchronous I/O, background tasks and web servers are all examples of multi-threaded applications, where each thread can run independently of every other thread with locks protecting access to shared objects. This is different from (and probably simpler than) parallel programs. Map-reduce is a great example of a parallel distributed algorithm, but it is only one parallel computing model: Multiple Instruction / Multiple Data (MIMD). Single Instruction / Multiple Data (SIMD) algorithms implemented on super-computers like Cray (more of a vector machine, but it's close enough to SIMD) and MasPar systems require different and far more complex algorithms. In addition, purpose-built supercomputers may have additional restrictions on their memory accesses, such as whether multiple CPUs can concurrently read or write from memory.

    Of course, the Cray and Maspar systems are purpose-built machines, and, much like special-build processors have fallen in performance to general purpose CPUs, Cray and Maspar systems have fallen into disuse and virtual obscurity; therefore, one might argue that SIMD-type systems and their associated algorithms should be discounted. But, there is a large class of problems -- particularly sorting algorithms -- well suited to SIMD algorithms, so perhaps we shouldn't be so quick to dismiss them.

    There is a book called An Introduction to Parallel Algorithms by Joseph JaJa (http://www.amazon.com/Introduction-Parallel-Algorithms-Joseph-JaJa/dp/0201548569) that shows some of the complexities of developing truly parallel algorithms.

    (Disclaimer: I own a copy of that book but otherwise have no financial interests in it.)

  16. Re:Chapel? by Raffaello · · Score: 3, Insightful

    No widely spoken human natural language was "invented," Modern English included. Where do people come up with these things? Modern English evolved out of Middle English, just as Spanish evolved out of Latin, etc. Modern English was not "invented" in any meaningful sense of "invernted."

    reference for WikiWeenies.

  17. Re:How to Solve the Parallel Programming Crisis by Louis+Savain · · Score: 4, Insightful

    Sorry but, IMO, the cell is a perfect example of how not to design a multicore processor. Heterogenous processors introduce nothing new to the table of solutions that was not already there. We had systems with CPUs and GPUs before the Cell (or Intel's Larrabee and AMD's Fusion) showed up. Everybody knows that they're a pain in the ass to program. Neither CUDA nor OpenCL nor Microsoft's much ballyhooed TBB (threaded building blocks) will change that fact.

    My point is that one does not design a parallel processor and then come up with a programming model to exploit it. It should be the other way around. The programming model should come first. One should design a model that makes parallel programming easy and the resulting apps rock-solid. Only then, after you have perfected your model, should you even consider designing a processor to support the model.

    IOW, everybody's doing it wrong, and by everybody, I mean the all big players in the multicore hardware/software industry: Intel, Microsoft, IBM, Sun-Oracle, AMD, ARM, Apple, FreeScale, etc. The major computer science centers who are getting a lot of research money form the industry are not helping either since they have to kowtow to the likes of Intel and AMD whose main interest is to safeguard their installed base and preserve continuity.

    It makes no difference. When the pain becomes unbearable (it's all about money), it will suddenly dawn on everybody that what is needed is to break away from the past.

  18. Re:I'm waiting for parallel libs for R by ceoyoyo · · Score: 4, Interesting

    Whoever told you that is mistaken.

    The easiest way to take advantage of a multiprocessing environment is to use techniques that will be familiar to any high level programmer. For example, you don't write for loops, you call functions written in a low level language to do things like that for you. Those low level functions can be easily parallelized, giving all your code a boost.