Slashdot Mirror


New Languages Vs. Old For Parallel Programming

joabj writes "Getting the most from multicore processors is becoming an increasingly difficult task for programmers. DARPA has commissioned a number of new programming languages, notably X10 and Chapel, written especially for developing programs that can be run across multiple processors, though others see them as too much of a departure to ever gain widespread usage among coders."

321 comments

  1. I'm waiting for parallel libs for R by G3ckoG33k · · Score: 1

    I'm waiting for parallel libs for R, even if I'm told that scripted languages won't have much of a future in parallel processing. All I can do is hope. Sigh.

    1. Re:I'm waiting for parallel libs for R by Daniel+Dvorkin · · Score: 2, Interesting

      There are some packages on CRAN that claim to implement parallel processing for R -- go to http://cran.r-project.org/web/packages/ and search for the text "parallel" to find several examples. I haven't tried any of them out yet, but sooner or later I'm going to have to.

      And actually, I think that "scripting" languages in general will have a very bright future in the parallel processing world. If memory management and garbage collection are implemented invisibly (and well!) in the core language, then the programmer can concentrate on the application logic and not have to worry about the kind of allocation headaches discussed in TFA. Python and R, where I spend most of my coding time these days, both offer very nicely implemented versions of function mapping, which I see as the key to making multiple processors useful for a wide variety of tasks. And no, the memory management and GC aren't quite there yet in either language, but they will be.

      --
      The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
    2. Re:I'm waiting for parallel libs for R by ceoyoyo · · Score: 4, Interesting

      Whoever told you that is mistaken.

      The easiest way to take advantage of a multiprocessing environment is to use techniques that will be familiar to any high level programmer. For example, you don't write for loops, you call functions written in a low level language to do things like that for you. Those low level functions can be easily parallelized, giving all your code a boost.

    3. Re:I'm waiting for parallel libs for R by gringer · · Score: 1

      I've used snow a few times. Seems to work quite well for the stuff I'm doing, especially when I'm working on a computer with MPI.

      --
      Ask me about repetitive DNA
    4. Re:I'm waiting for parallel libs for R by wfstanle · · Score: 1

      These low level functions that you speak of... They really are no different from high level functions. A function (or routine) still has to return something. The calling procedure is usually blocked pending the results. It's not just a matter of designing functions that can be run in parallel. Almost the entire program must be designed to take advantage of parallel processing.

    5. Re:I'm waiting for parallel libs for R by Haeleth · · Score: 1

      I think ceoyoyo was referring to the use of higher-order functions like map and reduce. Absent side-effects, these can be given parallelised implementations that will divide up the collection they're applied to between multiple processors.

      Yes, the caller is blocked until the function returns, but the function itself is using all available processors and so the caller is blocked for linearly less time. And this is much easier to arrange than having a compiler trying to work out whether it's possible to parallelise an explicit for loop.
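      A minimal Python sketch of the idea, assuming a side-effect-free function: the caller blocks on one map call while a pool of workers shares out the collection. The names `square` and `parallel_map` are made up for illustration. A thread pool is used only to keep the example self-contained; in CPython, threads will not speed up CPU-bound pure-Python code, and `ProcessPoolExecutor` offers the same `map` method for that case.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def square(x):
    # Side-effect free: the result depends only on the argument.
    return x * x


def parallel_map(func, items, workers=4):
    # Executor.map distributes the items across the pool and
    # yields the results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))


data = list(range(10))
squares = parallel_map(square, data)      # the caller blocks here until all items are done
total = reduce(operator.add, squares, 0)  # combine the mapped results
```

      The calling code looks the same as a serial map followed by a reduce; only the body of `parallel_map` knows about the workers.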

    6. Re:I'm waiting for parallel libs for R by Anonymous Coward · · Score: 0

      Use snow or snowfall. This should allow you to parallelize with ease.

    7. Re:I'm waiting for parallel libs for R by Anonymous Coward · · Score: 0

      Try 'multicore'. I've used it successfully and it is really easy to use. Ideally I'd like to get RScaLAPACK to work, but I haven't spent much time trying beyond finding out that the RScaLAPACK rpm in the Fedora repository doesn't seem to work on my machine. gl

    8. Re:I'm waiting for parallel libs for R by ceoyoyo · · Score: 2, Informative

      Haeleth is correct. The problem with a for loop is that it may or may not be parallelizable. There are some compilers that attempt to guess, but they generally do a poor job.

      The secret to efficient programming in interpreted languages is that whenever you want to do something where you'd normally use a for loop, you call a compiled function to do that for you. Those utility functions are generally explicitly either parallelizable or not, so the compiler doesn't have to guess - it knows.

      Suppose I want to do C = A + B, where A and B are large arrays. In a language like C I would write a for loop:

      for (int i = 0; i < aLength; i++) {
          C[i] = A[i] + B[i];
      }

      That's a fairly trivial example, but that simple loop can foul up a parallelizing compiler. The equivalent in, say Python, is:

      C = A + B

      where the + operator calls a compiled C function. Whoever writes that C function KNOWS the necessary loop can be calculated in parallel, and so can code it that way. From then on, everyone who uses the + operator benefits.

      Yes, you can do the same thing with parallel libraries in compiled languages, but programmers in those languages tend not to be used to that way of thinking. As an interpreted programmer you see many, many clever tricks to, for example, do large array computations using provided compiled functions rather than straightforward for loops. Those tricks are precisely the ones you need to learn as a major component of effective parallel programming.
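      A hedged illustration of the loop-versus-library-call contrast, using only the Python standard library. Note that `C = A + B` adds element by element only for array types such as NumPy's; with plain Python lists, `+` concatenates. Here `map` with `operator.add` stands in for the compiled whole-array operation. It does not run in parallel itself; it only shows the shape of code that leaves the loop to the library.

```python
import operator

A = [1.0, 2.0, 3.0, 4.0]
B = [10.0, 20.0, 30.0, 40.0]

# Explicit loop: the interpreter steps through every iteration itself.
C_loop = []
for i in range(len(A)):
    C_loop.append(A[i] + B[i])

# Whole-collection call: in CPython both map() and operator.add are
# implemented in C, so the per-element work happens outside the interpreter loop.
C_map = list(map(operator.add, A, B))
```

      Because the second form never spells out the iteration order, whoever implements the underlying operation is free to split the work across processors.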

    9. Re:I'm waiting for parallel libs for R by CarpetShark · · Score: 1

      Which amounts to running software, not programming new software.

    10. Re:I'm waiting for parallel libs for R by TeknoHog · · Score: 1

      Yes, the caller is blocked until the function returns, but the function itself is using all available processors and so the caller is blocked for linearly less time. And this is much easier to arrange than having a compiler trying to work out whether it's possible to parallelise an explicit for loop.

      You can also have parallelism at the language level. For example, in the following piece of Fortran

      A = B * sin(C)

      A, B and C can be arrays of the same dimensions. Thus the individual elements

      A[i] = B[i] * sin(C[i])

      can be executed in parallel, given a capable compiler.

      --
      Escher was the first MC and Giger invented the HR department.
    11. Re:I'm waiting for parallel libs for R by ceoyoyo · · Score: 1

      Um, no.

      I suppose you're one of those hotshots who think C or (shudder) C++ is programming the bare metal, hey?

    12. Re:I'm waiting for parallel libs for R by sjames · · Score: 1

      even if i'm told that scripted languages won't have much of a future in parallel processing.

      That might have been true when parallel processing was almost certainly a cluster. In those days, having a machine capable of parallel processing meant you were looking for the absolute maximum performance and the vast majority would be willing to jump through hoops to do it (including rewriting your app in a compiled language).

      Now that even low cost desktop boxes are likely to have 4 or more cores, that's not necessarily true. There are plenty of people who would modify a program in a scripted language to gain significant benefit, but will not re-implement in C or FORTRAN. In other words, for increasing numbers of problems, the cost of extra parallel CPU cycles is now smaller than the man-hour cost for a re-implementation.

      Meanwhile, the sharp performance divide between scripted languages and formally compiled ones is no longer there. With high level scripting languages compiling to bytecode and encouraging the use of well tested highly optimized algorithms for things like hashing and list handling rather than one-off implementations, plus the potential to use the CPU cache better, scripted programs can be FASTER than compiled sometimes. In the worst case, they're not as much slower as they used to be.

    13. Re:I'm waiting for parallel libs for R by CarpetShark · · Score: 1

      Um, no.

      I suppose you're one of those hotshots who are psychic, hey? ;)

    14. Re:I'm waiting for parallel libs for R by ceoyoyo · · Score: 1

      Perhaps you'd like to explain how calling a library function is not programming new software then? Bear in mind that using the + operator in C is effectively calling a library function.

    15. Re:I'm waiting for parallel libs for R by CarpetShark · · Score: 1

      Perhaps you'd like to explain how calling a library function is not programming new software then?

      Seriously? You think calling a library function is "programming software"? So using rundll on windows is "programming software"? Using dcop on Linux is "programming software"?

      Sorry, if you want to argue about stuff like this, you'll have to find someone a lot more interested in wasting time.

    16. Re:I'm waiting for parallel libs for R by ceoyoyo · · Score: 1

      Yes, this is a pretty pointless conversation when you simply assume the definition of "library function" is something that meets your extremely narrow (and incorrect) definition, in order that you can pretend you're not hopelessly mistaken.

    17. Re:I'm waiting for parallel libs for R by CarpetShark · · Score: 1

      So now you're claiming that DLLs are not libraries, and that RunDLL does not call a library function? Grow the fuck up, and admit when you're wrong.

    18. Re:I'm waiting for parallel libs for R by alexandre_ganso · · Score: 1
  2. Parallel is here to stay but not for every app by Meshach · · Score: 3, Insightful

    Parallel is not going to go anywhere but is only really valid for certain types of applications. Larger items like operating systems or most system tasks need it. Whether it is worthwhile in lowly application land is a case by case decision; but will mostly depend on the skill of programmers involved and the budget for the particular application in question.

    --
    "Maybe this world is another planet's hell"
    Aldous Huxley
    1. Re:Parallel is here to stay but not for every app by aereinha · · Score: 2, Insightful

      Parallel is not going to go anywhere but is only really valid for certain types of applications.

      Exactly, some problems are inherently serial. These programs would run slower if you made them run in parallel.

    2. Re:Parallel is here to stay but not for every app by Nursie · · Score: 4, Informative

      How blinkered are you?

      There exist whole classes of software that have been doing parallel execution, be it through threads, processes or messaging, for decades.

      Look at any/all server software, for god's sake, look at apache, or any database, or any transaction engine.

      If you're talking about desktop apps then make it clear. The thing with most of those is that the machines far exceed their requirements with a single core, most of the time. But stuff like video encoding has been threaded for a while too.

    3. Re:Parallel is here to stay but not for every app by Daniel+Dvorkin · · Score: 4, Interesting

      True enough, but the class of applications for which parallel processing is useful is growing rapidly as programmers learn to think in those terms. Any program with a "for" or "while" loop in which the results of one iteration do not depend on the results of the previous iteration, as well as a fair number of such loops in which the results do have such a dependency, is a candidate for parallelization -- and that means most of the programs which most programmers will ever write. We just need the languages not to make coding this way too painful.
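      One sketch of a loop whose iterations do depend on each other yet still parallelizes: a running total. Each step needs the previous total, but because addition is associative the data can be split into chunks, each chunk summed independently, and the partial sums combined. `chunked` and `parallel_sum` are illustrative names, and integers are used so the regrouping cannot change the result (floating-point sums may differ slightly in rounding).

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(seq, n_chunks):
    # Split seq into at most n_chunks contiguous slices of near-equal size.
    size = max(1, -(-len(seq) // n_chunks))  # ceiling division
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def parallel_sum(values, workers=4):
    chunks = chunked(values, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))  # each chunk is summed independently
    return sum(partials)                        # combine the partial results


values = list(range(1, 101))

# Serial version: every iteration depends on the total from the previous one.
serial_total = 0
for v in values:
    serial_total += v

parallel_total = parallel_sum(values)
```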

      --
      The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
    4. Re:Parallel is here to stay but not for every app by global_diffusion · · Score: 1

      "Parallel is not going to go anywhere..."

      Really? Look inside any machine nowadays. I'm working on an 8 core machine right now. The individual cores aren't going to get that much faster in the years to come, but the number of cores in a given processor is going to increase dramatically. Unless you want your programs to stay at the same execution speed for the next 5-10 years, you need parallel. And what we need is languages and compilers that abstract away the actual hard work so that anybody can make a parallel program.

    5. Re:Parallel is here to stay but not for every app by cheftw · · Score: 2, Funny

      The individual cores aren't going to get that much faster in the years to come

      I'm sure I've heard something like this before...

      --
      Always back up, never back down. ---- Think you're cool 'cos your uid is prime? Take mine, modulo the one digit integers
    6. Re:Parallel is here to stay but not for every app by Meshach · · Score: 2, Insightful

      I guess in hindsight I should have been clearer and less "blinkered"...

      For real time apps that do transactions Parallel is needed. What I was comparing them to is desktop apps where in many cases the benefit does not really exist. The main point I was trying to get across is that parallel programming is difficult and not needed for every application.

      --
      "Maybe this world is another planet's hell"
      Aldous Huxley
    7. Re:Parallel is here to stay but not for every app by gbjbaanb · · Score: 1

      ah, but all of those server-side apps are effectively doing a single task, multiple times - ie, each request occurs in a different thread, they do not split 1 request onto several CPUs. That's what all this talk of 'desktop parallelism' is all about.

      So now everyone sees multiple cores on the desktop and thinks to themselves, that data grid is populating really slowly.. I know, we need to parallelise it, that'll make it go faster! (yeah, sure it will)

      I'm sure there are tasks that will benefit from parallel processing (I think of map routing, not that Google's directions are particularly slow) but the vast majority simply won't be worth the effort to code.

    8. Re:Parallel is here to stay but not for every app by grumbel · · Score: 4, Insightful

      Most computing intensive problems that a user will encounter at home are easily parallelizable, e.g. video encoding, gaming, photoshop filters, webbrowsing and so on. The number of times I maxed out a single CPU on a problem that would not have been to some large degree parallelizable is close to zero.

      The trouble is that they are only "easily" parallelizable in concept; implementing parallelization on an existing serial codebase is where it gets messy.

    9. Re:Parallel is here to stay but not for every app by hedwards · · Score: 1

      Except in this case it's likely to be true. Transistors can only be so small before it becomes technically impossible with infeasible being somewhat before that. Additionally electrons can only go so fast through a circuit and you need a certain number of them to work well. Or to put it another way, we're getting relatively close to the point of diminishing returns on that aspect of computing. Sure engineers could make things go quite a bit faster, but realistically it's questionable as to how much faster a processor can get without being unreasonably expensive.

      Cores on the other hand are nowhere near that point, a desktop computer could probably benefit from up to about 16, whereas more specialized requirements benefit from much larger numbers of cores. The main limiting challenges to that are getting them to play well together, energy and physical space on chip/chips.

    10. Re:Parallel is here to stay but not for every app by Anonymous Coward · · Score: 0

      Yeah like no computer will ever use more than 128k of ram. Take your blinders off. Multiple cores and large amounts of memory will allow desktop apps to do things we have never imagined and to do them in concert with the OS and other apps.

      The OS and development tools weigh heavily on all of this which never seems to get mentioned. With Snow Leopard and Grand Canyon Apple is far ahead. Linux/Unix have the OS but not the tools. Windows is screwed.

    11. Re:Parallel is here to stay but not for every app by Moochman · · Score: 1

      I think he means "not going anywhere" as in "here to stay". In other words he's agreeing with you.

      But yeah, I read it the other way too the first time around.

    12. Re:Parallel is here to stay but not for every app by cheftw · · Score: 1

      What about quantum computing, or the next big discovery? Vacuum tubes could only get so small too. Do you not think today's technology will look short-sighted and foolish in 50 years?

      --
      Always back up, never back down. ---- Think you're cool 'cos your uid is prime? Take mine, modulo the one digit integers
    13. Re:Parallel is here to stay but not for every app by Jane+Q.+Public · · Score: 1

      There are a number of advancements in microarchitecture that will keep Moore's Law going for a good while yet.

      The processor makers have added parallel cores because it was easier and less expensive for them to do so. On the other hand, there are "sharpening" techniques that will allow light lithography to be scaled smaller still, and none of the major chip houses have even started using X-Ray lithography yet, which will allow them to go smaller still.

      Yes, it is getting to the point that individual features are experiencing quantum effects. On the other hand, some of those effects can be counteracted (with a little extra engineering), and others can actually be taken advantage of.

      There have also been some materials breakthroughs that will help.

      I am not saying it will be easy or cheap, but it wasn't easy or cheap to get where we already are.

    14. Re:Parallel is here to stay but not for every app by AuMatar · · Score: 4, Insightful

      And how many of those cores are above 2% utilization for 90% of the day? Parallelization on the desktop is a solution in search of a problem- we have in a single core dozens of times what the average user needs. My email, web browsing, word processor, etc aren't cpu limited. They're network limited, and after that they're user limited (a human can only read so many slashdot stories a minute). There's no point in anything other than servers having 4 or 8 cores. But if Intel doesn't fool people into thinking they need new computers their revenue will go down, so 16 core desktops next year it is.

      --
      I still have more fans than freaks. WTF is wrong with you people?
    15. Re:Parallel is here to stay but not for every app by peragrin · · Score: 2, Interesting

      yea why can't you buy 6 ghz cores ? Is it because unless you super cool them you can't clock them that high?

      3.8 ghz P4 was released in 2005. Instead Intel has focused on power savings, and adding cores while shrinking die sizes.

      Quantum computing is a long ways off, heck they can't even get a good memristor yet. The advantage we are having is that memory speeds are finally catching up to processor speeds. Combine that with a memristor at that speed and computing will take a whole new direction for efficiencies and speed. However clock speed isn't going to significantly increase for a while.

      --
      i thought once I was found, but it was only a dream.
    16. Re:Parallel is here to stay but not for every app by cheftw · · Score: 1

      I agree, but I feel obliged to point out that the "speed" of a processor doesn't normally refer to the clock speed. Like was the P4 really "faster" than a fancy opteron?

      --
      Always back up, never back down. ---- Think you're cool 'cos your uid is prime? Take mine, modulo the one digit integers
    17. Re:Parallel is here to stay but not for every app by jbolden · · Score: 1

      Don't forget coprocessors. Imagine if your video card understood video decoding itself and cached....

    18. Re:Parallel is here to stay but not for every app by Tanktalus · · Score: 1

      Define the applications for which parallel processing is not needed. It might be a smaller list than you think. For example, think of a spreadsheet - parallel processing can really help here when trying to resolve all the cells with complex inter-related calculations. I mean, they already need to do tricks to keep them responsive today, trying to recalculate only the stuff that's showing rather than the entire document (all n sheets).

      Word processors? Well, besides having embedded spreadsheets, they also recalculate a lot of stuff in the background, though probably a lot less than a spreadsheet.

      Is it required? Not always. But for suitably complex spreadsheets, I can imagine some accountants wishing there was some parallel processing done.

      Even email programs - I curse at kmail for not being parallel. Sometimes when it runs spamassassin against incoming email, it can get annoyingly slow, and I have a quad-core CPU! At least I can go do other things while waiting for kmail to catch up. If only it could launch spamassassin in alternate threads, that'd be great.

      Newsreader? Loading articles in the background might take up a bit more RAM/disk, but it'd make things super fast as I go through the newsgroup.

      Browser? Loading and parsing pages like /. articles without making firefox unresponsive will be nice (waiting for a version of Chrome for Linux that is at least beta before I try it).

      There are desktop apps that don't need parallel processing. But many would be somewhat to hugely advantaged by going parallel, even if not always for home use.
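      The spreadsheet case can be sketched as a dependency graph: cells whose inputs are already known are independent of each other and can be recalculated as one parallel batch. The sheet below and the `recalculate` helper are hypothetical; the scheduling uses `graphlib.TopologicalSorter` from the standard library (Python 3.9 or later), and a thread pool stands in for whatever workers a real application would use.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical sheet: cell name -> (cells it depends on, formula over their values).
cells = {
    "A1": ((), lambda: 2),
    "A2": ((), lambda: 3),
    "B1": (("A1", "A2"), lambda a1, a2: a1 + a2),
    "B2": (("A1",), lambda a1: a1 * 10),
    "C1": (("B1", "B2"), lambda b1, b2: b1 * b2),
}


def recalculate(cells, workers=4):
    values = {}

    def compute(name):
        deps, formula = cells[name]
        return name, formula(*(values[d] for d in deps))

    sorter = TopologicalSorter({name: deps for name, (deps, _) in cells.items()})
    sorter.prepare()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while sorter.is_active():
            ready = sorter.get_ready()              # cells whose inputs are all known
            batch = list(pool.map(compute, ready))  # evaluate the whole batch concurrently
            values.update(batch)
            sorter.done(*ready)                     # unlocks the cells that depend on them
    return values


result = recalculate(cells)
```

      Here A1 and A2 form the first batch, B1 and B2 the second, and C1 the last; how much parallelism is available depends on how wide each batch is.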

    19. Re:Parallel is here to stay but not for every app by Beale · · Score: 1

      The Intel compiler can actually use similar code to the SSE vectorisation module to automatically parallelise some constructs for OpenMP shared memory parallelism. Similarly, the newer FORTRAN standards contain constructs which are designed such that they can be automatically parallelised.

      If you can write either your compiler or your language in such a way that parallel regions can be inferred rather than explicitly specified, you get more performance for very little end-programmer effort.

    20. Re:Parallel is here to stay but not for every app by larry+bagina · · Score: 1

      That's Grand Central (aka Open CL). Microsoft is also including their own implementation in Direct X 11.

      --
      Do you even lift?

      These aren't the 'roids you're looking for.

    21. Re:Parallel is here to stay but not for every app by Tablizer · · Score: 1

      Any program with a "for" or "while" loop in which the results of one iteration do not depend on the results of the previous iteration, as well as a fair number of such loops...is a candidate for parallelization

      A lot of those kinds of operations can be farmed off to a database or database-like thing where explicit loops are needed less often. The database is an excellent place to take advantage of parallelism because most query languages are functional-like.

      I remember in desktop-databases (dBase, FoxPro, Paradox, etc.) I almost never used arrays because the table-system was so readily available. However, OOP's rise killed off this style and brought back explicit "getNext" loops. If tools became more database-centric again, perhaps parallelism can be added under the hood without the app developer having to manually manage threads. (I just wish SQL was modernized to be more app-friendly).

    22. Re:Parallel is here to stay but not for every app by Anonymous Coward · · Score: 0

      ". . . will allow desktop apps to do things we have never imagined . . ."

      You just agreed with your opponent AuMatar's point: "Parallelization on the desktop is a solution in search of a problem . . ."

      Someday, maybe, hopefully, I wish I may, I wish I might . . .

    23. Re:Parallel is here to stay but not for every app by MikeBabcock · · Score: 1

      ... and they often run in parallel with other software.

      It blows my mind how many people don't realize their computer is almost always doing more than one thing at a time. A good OS that knows how to schedule the correct processes to the correct processors can give you a good benefit from parallelism without needing to run multithreaded software.

      Certainly if you only ever use one program on your OS it will be minimal, but it will still be there, even if it's just your anti-virus software running in parallel to Photoshop.

      --
      - Michael T. Babcock (Yes, I blog)
    24. Re:Parallel is here to stay but not for every app by nurb432 · · Score: 1

      If we all become part of some huge cloud and share our ( mostly mobile ) resources by default, it may apply even to the most lowly of text editors.

      --
      ---- Booth was a patriot ----
    25. Re:Parallel is here to stay but not for every app by MikeBabcock · · Score: 1

      Browser - network threads, layout thread, image compression decoding threads, Javascript threads ...
      E-mail - network threads, parsing and sorting thread, storage thread

      And so on.

      --
      - Michael T. Babcock (Yes, I blog)
    26. Re:Parallel is here to stay but not for every app by tkinnun0 · · Score: 1

      And are those 16 cores going to help with responsiveness? No. If I type the letter 'B' into my IDE, how is that to be broken into 16 or more subtasks that can be run in parallel? There's nothing but raw serial computing power that can help with that.

    27. Re:Parallel is here to stay but not for every app by drinkypoo · · Score: 1

      parallel programming is difficult and not needed for every application.

      It doesn't have to be difficult if the language is designed for it, and it is useful in nearly every application if you have a bunch of cores to utilize.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    28. Re:Parallel is here to stay but not for every app by Meshach · · Score: 1

      Define the applications for which parallel processing is not needed. It might be a smaller list than you think.

      It probably is a small list caused by parallel slowdown. Also I have heard anecdotal comments about certain tasks not being appropriate for parallel systems. That is just a vague note in my mind though...

      --
      "Maybe this world is another planet's hell"
      Aldous Huxley
    29. Re:Parallel is here to stay but not for every app by drinkypoo · · Score: 1

      Around 4 GHz you start running into interesting problems with RF. The market went towards lower power consumption (as you say) and we got more parallelism instead of high clock rates. This will work out fine for a while; a 2 GHz processor of today is worth twice as much as the first processors to reach those speeds, if not more.

      Nobody really knows when/if the next big jump in clock speed is coming, though.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    30. Re:Parallel is here to stay but not for every app by hitmark · · Score: 1

      another option could be that the compiler becomes thread savvy, and turns loops into threads as needed...

      --
      comment first, facts later. http://chem.tufts.edu/AnswersInScience/RelativityofWrong.htm
    31. Re:Parallel is here to stay but not for every app by FlyingBishop · · Score: 1

      Moore's law suggests otherwise. When you get up to the standard machine having 32+ cores, even the desktop will start to see the advantage of multithreading, if you can find a way to express these things elegantly. Until alt+tab starts to look truly instant, there's still room to work, and if we've got machines with double the cores they have now, it starts to be worth your while to figure out how to divvy that 1 second task up into 30 one-tenth second tasks. Or something like that.

      Parallel programming may be difficult, but we're kind of past the stage where there are honest benefits to be had from single-core optimization, and on to "how can I make this (as far as a human is concerned) instantaneous by threading?"

    32. Re:Parallel is here to stay but not for every app by shutdown+-p+now · · Score: 1

      A lot of those kinds of operations can be farmed off to a database or database-like thing where explicit loops are needed less often.

      You don't need a database for a list of 1000 records. Besides, a database isn't all magic and fairy dust - it uses the same data structures under the hood.

      On the other hand, using parallelization techniques employed by modern databases more widely (e.g. in queries over in-memory collections) makes a lot of sense.

      However, OOP's rise killed off this style and brought back explicit "getNext" loops.

      There's no connection between OOP and "getNext" - I assume by the latter you mean iterators/enumerators, but if so, they were a common pattern in procedural languages long before OOP.

      It seems that what you really want are query operators available on language or library level, so that they don't have to be hardcoded as loops. First-class functions neatly enable those on library level, and virtually any mainstream language today has something that uses it, either out of the box (Python list comprehensions, C# LINQ etc), or third-party.

    33. Re:Parallel is here to stay but not for every app by Anonymous Coward · · Score: 1, Interesting

      You can make video encoding very parallel, but you reduce the quality (image quality decreases and bitrate increases) because the most efficient use of motion compensation video compression techniques, like the ones used in all MPEG derivatives, requires using the result of processing one entire frame before processing the next frame. In other words, you can make your encoder highly parallel, but you won't get anything resembling the compression and quality of even a video CD.

    34. Re:Parallel is here to stay but not for every app by Anonymous Coward · · Score: 0

      Holy crap - have you been on the web lately? Facebook + cnn + foxnews + huffingtonpost + (now) slashdot + hulu maxes out my CPUs to 150% (1.5 cpus) doing absolutely nothing. All that GD flash and background JS. To say nothing of their memory leaks (my laptop becomes useless if I leave these open overnight)

    35. Re:Parallel is here to stay but not for every app by maxume · · Score: 2, Insightful

      The 16 cores help because your operating system is doing hundreds of things all the time, not just waiting for you to type B into the currently active gui.

      --
      Nerd rage is the funniest rage.
    36. Re:Parallel is here to stay but not for every app by jpmorgan · · Score: 1

      Even with the best automatic parallelization tools, typically the performance gains you get are in line with the amount of programmer effort put in.

    37. Re:Parallel is here to stay but not for every app by zuperduperman · · Score: 1

      I think you overlook cause and effect here, and suffer from a lack of imagination. The fact is, we have largely not developed applications that can use parallel processing because we haven't had parallel processing to do it with. Many existing design patterns (eg: MVC or Document / View) are actually contortions of parallel algorithms to deal with the fact that our programming paradigms are not parallel. Once mainstream languages incorporate parallel processing in their core as primitives we may think very differently about many existing computing problems, and probably will see very different designs emerge, and whole new classes of application that we never thought of before.

    38. Re:Parallel is here to stay but not for every app by marm · · Score: 1

      I think you mean you can make it work with normal quality but you won't get linear scaling of performance as the number of processors increases... For prerecorded material you could do a first pass finding keyframes then encode keyframe-to-keyframe sections on different processors, which will probably scale much better as the processor's L2 cache will be useful.

    39. Re:Parallel is here to stay but not for every app by speilberg0 · · Score: 1

      You could get around the reduction in MPEG encoding quality at least by having a server thread working out the GOP ranges and encoding them in separate encoder threads. After that the resultant encoded GOPs just need to be assembled in the correct order.
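
      A minimal sketch in C of the segmenting step described above. It assumes a first pass has already flagged which frames are keyframes; the names `gop_range` and `split_gops` are hypothetical and not taken from any real encoder.

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* One keyframe-to-keyframe section: frames [start, end). */
      struct gop_range {
          size_t start;
          size_t end;
      };

      /*
       * Walk is_key[0..n_frames) and write one range per keyframe-led section
       * into out. Returns the number of ranges written (at most max_out).
       * Assumption: frame 0 is a keyframe.
       */
      size_t split_gops(const unsigned char *is_key, size_t n_frames,
                        struct gop_range *out, size_t max_out)
      {
          size_t count = 0;
          size_t start = 0;

          if (n_frames == 0)
              return 0;
          assert(is_key[0]);

          for (size_t i = 1; i <= n_frames; i++) {
              /* A section ends at the next keyframe or at the end of the video. */
              if (i == n_frames || is_key[i]) {
                  if (count == max_out)
                      return count;
                  out[count].start = start;
                  out[count].end = i;
                  count++;
                  start = i;
              }
          }
          return count;
      }
      ```

      Each range could then be handed to its own encoder thread, with the encoded sections concatenated in range order afterwards.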

    40. Re:Parallel is here to stay but not for every app by cryptoluddite · · Score: 1

      Any program with a "for" or "while" loop in which the results of one iteration do not depend on the results of the previous iteration ... We just need the languages not to make coding this way too painful.

      It's not the languages that are the problem, it's the operating systems. Check it:

      1. Start 4 threads because OS said there are 4 cores
      2. Each thread starts doing N/4 iterations of the loop
      3. Each thread does some kind of synchronization to say it exited. It could be just a write, but there needs to be some way to wake up the original thread.
      4. The original thread blocks until all the other threads have completed, then continues operating.

      The best case is that there is lots of overhead creating a new thread, and the other threads have completed before the original one so that it doesn't have to wait.

      The worst case is that the new threads don't get scheduled right away. Maybe flash is using 100% CPU on one core, so one thread might not even start running until one of the others completes. Anyway you end up with several threads not running and being scheduled at some later time. The for loop ends up taking several times longer even though it's running 'in parallel'.
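
      The four steps listed above can be sketched with pthreads roughly as follows. This is an illustration under assumptions, not anyone's production code: `parallel_sum`, `sum_slice` and `MAX_THREADS` are made-up names, and the loop body is a plain array sum chosen only because its iterations are independent.

      ```c
      #include <assert.h>
      #include <pthread.h>
      #include <stddef.h>

      #define MAX_THREADS 64

      struct slice {
          const int *data;
          size_t begin, end;   /* this thread's share of the iterations */
          long partial;        /* result written by the worker */
      };

      /* Step 2: each thread runs its share of the loop. */
      static void *sum_slice(void *arg)
      {
          struct slice *s = arg;
          long total = 0;
          for (size_t i = s->begin; i < s->end; i++)
              total += s->data[i];
          s->partial = total;  /* step 3: the "just a write" */
          return NULL;
      }

      long parallel_sum(const int *data, size_t n, int n_threads)
      {
          pthread_t tid[MAX_THREADS];
          struct slice work[MAX_THREADS];
          int started[MAX_THREADS];
          long total = 0;

          assert(n_threads >= 1 && n_threads <= MAX_THREADS);

          /* Step 1: start one thread per core the caller asked for. */
          for (int t = 0; t < n_threads; t++) {
              work[t].data = data;
              work[t].begin = n * (size_t)t / (size_t)n_threads;
              work[t].end = n * (size_t)(t + 1) / (size_t)n_threads;
              work[t].partial = 0;
              started[t] = pthread_create(&tid[t], NULL, sum_slice, &work[t]) == 0;
              if (!started[t])
                  sum_slice(&work[t]);  /* could not start a thread: run inline */
          }

          /* Step 4: the original thread blocks until every worker is done. */
          for (int t = 0; t < n_threads; t++) {
              if (started[t])
                  pthread_join(tid[t], NULL);
              total += work[t].partial;
          }
          return total;
      }
      ```

      On Linux and many Unix systems the thread count could come from `sysconf(_SC_NPROCESSORS_ONLN)`. Nothing in this sketch guarantees that the new threads are scheduled promptly, which is exactly the overhead being described.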

      Now what about a case like this:

      1. Ask the OS to split the thread across multiple CPUs. This would be a guarantee that the threads would run immediately on the number of CPUs returned by the call.
      2. Run the loops on N cores (however many the OS returns).
      3. Each thread 'exits' when it is done. For the last thread to exit, the OS returns to it instead of exiting. Whichever thread finished last continues.

      This doesn't suffer from the threads originally not starting right away and has no real synchronization except in the OS itself. It's awesome for smallish loops. It's also impossible to do with current operating systems.

    41. Re:Parallel is here to stay but not for every app by Anonymous Coward · · Score: 0

      Try a linux flavour. I 'alt-tab'ed around a bit and it was the speed of my screen that slowed the process down. Been that way as long as I remember.

    42. Re:Parallel is here to stay but not for every app by Eskarel · · Score: 2, Insightful

      While technically most servers are somewhat parallel in nature, it isn't really the same sort of thing that these sorts of languages are designed to achieve.

      Servers, for the most part, are parallel because they have to be able to handle a lot of requests simultaneously, so they spin off a new thread or process(depending on the architecture) for each request to the server, do some relatively simple concurrency checking and then run each request, for the most part, in serial. They're parallel because the task being performed is parallel, a web server that could only return the page to one person at a time wouldn't work, not to make them faster or more efficient, or to take advantage of multiple processors. This kind of parallel architecture is relatively simple, because the architecture is defined by the requirements of the project and you just have to ensure that it works.

      Taking something like a video encoder, PC game, compiler, etc and making it run in parallel so that it's faster and can take advantage of modern hardware is a totally different kettle of fish. You have to redesign your core idea so that it works in parallel, you have to turn one discrete task into two or more which can be run at the same time. It's a whole different challenge and one which very few programmers(myself included) seem to be prepared to meet.

    43. Re:Parallel is here to stay but not for every app by ceoyoyo · · Score: 1

      And how many of those cores are above 2% utilization for 90% of the day?

      It doesn't matter. Nobody cares how utilized their cores are for most of the day. The question is, how hard do they work when the user comes along and decides to do something that requires them.

      Decoding 1080p H.264 video is something I could use an extra core or two for. That's not such an esoteric task.

    44. Re:Parallel is here to stay but not for every app by Anonymous Coward · · Score: 0

      If making software run in parallel is a performance optimization, why not treat it just like any performance optimization, there to be implemented only for performance-critical pieces of code? If the need is there, coders will have the motivation and the pay to implement it. All we need is simplified message passing.

    45. Re:Parallel is here to stay but not for every app by Tablizer · · Score: 1

      Besides, a database isn't all magic and fairy dust - it uses the same data structures under the hood.

      Yes, but because most relational query languages are based on functional-like techniques, they are fairly easy to split up to multiple processors without affecting the outcome (as defined). Unlike your suggestion, we don't have to change our gazillion app programming languages for such.

      It seems that what you really want are query operators available on language or library level, so that they don't have to be hardcoded as loops. First-class functions neatly enable those on library level, and virtually any mainstream language today has something that uses it, either out of the box (Python list comprehensions, C# LINQ etc), or third-party.

      Yes, database-like features. There are other ways to do collection-orientation, but keeping it similar to the DB helps in consistency, scaling, and conversion.
               

    46. Re:Parallel is here to stay but not for every app by cnettel · · Score: 1

      I am not sure how you think that the splitting in itself would ensure immediate scheduling. You basically ask that your thread should be given top priority for an indefinite amount of time. WaitForMultipleObjects and friends in other OSs is as good as it gets, but in that case you only have to wait for the threads to quit in your main thread. The only "real" synchronization is in the OS itself, you only inform the OS of your intent.

      Splitting an existing thread cannot be done too cheaply, you will need to allocate a new stack and so on. The other option is to keep a thread pool where you sacrifice the full generality of threads. Lo and behold, many OSes or low-level libraries also provide readily available thread pools!

    47. Re:Parallel is here to stay but not for every app by cryptoluddite · · Score: 1

      I am not sure how you think that the splitting in itself would ensure immediate scheduling.

      By definition. If the OS can't schedule it immediately on 4 cores, but can on 3 cores then it can split into 3 threads. No OS I know of can do this or has an API to do this.

      Lo and behold, many OSes or low-level libraries also provide readily available thread pools!

      Which you have to signal, then wait for the threads to be scheduled, then wait for the 'main thread' to be signaled that they completed. And you have no idea what else is going on so you don't know how many threads to use in the first place. ... which was kinda the point of my post in the first place.

    48. Re:Parallel is here to stay but not for every app by DarrylKegger · · Score: 1

      I'm not trying to be a smart arse but do you mean "memristor"?

    49. Re:Parallel is here to stay but not for every app by bertok · · Score: 3, Interesting

      The % utilization metric is a red herring. Most servers are underutilized by that metric, which is why VMware is making so much money consolidating them!

      Users don't actually notice, or care, about CPU utilization. What users notice, is latency. If my computer is 99% idle, that's fine, but I want it to respond to mouse clicks in a timely fashion. I don't want to wait, even if it's just a few hundred milliseconds. This is where parallel computation can bring big wins.

      One thing I noticed is that MS SQL Server still has its default "cost threshold for parallelism" set to "5", which AFAIK means that if the query planner estimates that a query will take more than 5 seconds, it'll attempt a parallel query plan instead. That's insane! I don't know what kind of users Microsoft is thinking of, but in my world, if a form takes 5 seconds to display, it's way too slow to be considered acceptable. Many servers now have 8 or more cores, and 24 (4x hexacore) is going to be common for database servers very soon. In that picture, even if you only consider a 15x speedup due to overhead, 5 seconds becomes something like 300 milliseconds!

      Ordinary Windows applications can benefit from the same kind of speedup. For example, a huge number of applications use compression internally (all Java JAR files, the docx-style Office 2007 files, etc...), yet the only parallel compressor I know of is WinRAR, which really does get 4x the speed on my quad-core. Did you know that the average compression rate for a normal algorithm like zip is something like 10MB/sec/core? That's pathetic. A Core i7 with 8 threads could probably do the same thing at 60 MB/sec or more, which is more in line with, say, gigabit ethernet speeds, or a typical hard-drive.

      In other words, for a large class of apps, your hard-drive is not the bottleneck, your CPU is. How pathetic is that? A modern CPU has 4 or more cores, and it's busy hammering just one of those while your hard-drive, a mechanical component, is waiting to send it more data.

      You wait until you get an SSD. Suddenly, a whole range of apps become "cpu limited".

    50. Re:Parallel is here to stay but not for every app by kill+the+white+man · · Score: 1

      Right, and 640k ought to be enough for anyone. Some classes of applications do not stand to benefit from parallelism, but there are so many other types of programs that beg for it. Also, just because a system is typically underutilized doesn't mean that it shouldn't be able to perform when necessary.

      I would guess that about 50% of the time, I'm running some CPU intensive computations that completely occupy at least a single core on my machine (usually neuroscience simulations). Please, don't assume that your "average user" needs are the end all for everyone...

    51. Re:Parallel is here to stay but not for every app by Anonymous Coward · · Score: 0

      Most computing intensive problems that a user will encounter at home are easily parallelizable, i.e. video encoding, gaming, photoshop filters, webbrowsing and so on. The amount of times where I maxed out a single CPU and the given problem would not have been to some large degree parallelizable are close to zero.

      Bullshit! Some of these things may be partially parallel; however, many of them only utilise "parallel" (in reality, not parallel, just using multiple CPUs) to improve some aspects. Web-browsing for instance, is not inherently parallelizable. You might use separate threads each running/loading via a separate CPU, but that is NOT a parallel operation! Similarly, many processes running on separate CPUs in games are probably not parallel. (I.e., sound from one CPU, input processing on another, Internet processes on another CPU, video on several more CPUs ...)

    52. Re:Parallel is here to stay but not for every app by SlashWombat · · Score: 1

      You can make video encoding very parallel, but you reduce the quality (image quality decreases and bitrate increases) because the most efficient use of motion compensation video compression techniques, like the ones used in all MPEG derivatives, requires using the result of processing one entire frame before processing the next frame. In other words, you can make your encoder highly parallel, but you won't get anything resembling the compression and quality of even a video CD.

      Yes, but the 8x8 or 16x16 DCT part of the compression will be much sped up by throwing multiple CPUs at the task. You do better doing the DCT in hardware. Even significant parts of the motion estimation are better (more quickly) handled by dedicated hardware. You don't really think that the dedicated h264 chips on the market actually use CPUs to do the compression of the video stream? Yet they manage to do a reasonable job on virtually all the imagery thrown at them. (Okay, I do know of one very early (MPEG2 steering committee) MPEG2 hardware that would re-evaluate parts of the video, go back and encode them differently if the section did not meet the bitrate or quality requirements ... but strictly speaking, this approach is no longer "baseline" encoding!)

    53. Re:Parallel is here to stay but not for every app by Anonymous Coward · · Score: 0

      Intel also made the sane decision not to focus on frequency at the expense of everything else. There are other metrics, like instructions per cycle, that are quite important... you know. That's why the Cores were *faster* than the P4 even while being clocked at a way lower GHz. When will the MHz myth disappear completely (even if it is true to a point)?

    54. Re:Parallel is here to stay but not for every app by amn108 · · Score: 1

      Perhaps the technology to implement a Universal Turing machine (transistors today) will expire at one point (it is only logical to assume so), but other technologies that can start from there to continue implementing that same machine will emerge and be put into use. What I am saying is that the method may change but the concept will live on. I.e., serial processing will continue to evolve, and who knows where the ultimate limit is? Transistors can only go so small, yes, but how about computers made with quantum mechanics or DNA or gods know what else? Even if at one point Turing completeness starts being a bottleneck itself, another form of serial processing concept will take over and be implemented as hardware, who knows. I do believe serial processors can go a lot faster. There are millions of transistors per mm^2 on a modern CPU die, but there are trillions of atoms in that same area, and who knows what they can do in terms of supporting a serial machine of sorts.

      Of course, the above can and perhaps should coexist with the parallel processing concept; the two don't have to negate each other. A fast serial processor is good, but a lot of serial processors working together to solve a problem is even better.

      Perhaps a good cue would be to observe mother Nature and how it gains efficiency by parallelizing anything it can, while still optimizing a multitude of heterogeneous serial processes one by one separately.

    55. Re:Parallel is here to stay but not for every app by grumbel · · Score: 3, Insightful

      Web-browsing for instance, is not inherently parallelizable.

      Of course it is. Each tab can be rendered in parallel, each JPEG can be decoded in parallel, each character can be antialiased in parallel, each download can happen in parallel, all the pixels that make up the final image can be alpha blended in parallel.

      Yes, at some point you hit a wall where you will be left with a piece of code that you have to evaluate sequentially: you can't alpha blend half a pixel, and your layout engine might also require some sequential evaluation. My point however is that none of those are CPU intensive enough to be a problem for normal use.

      These days I don't have to wait for my computer because it has to solve some tiny non-parallelizable task, but because my computer is doing some heavy data crunching that could be done in parallel without much of a problem.

      Or to put it another way: half the for-loops out there just process each element in a list. The major reason why they don't do it in parallel is that the programming language makes it extremely hard to do so, not that there is some fundamental reason why it can't be done.

    56. Re:Parallel is here to stay but not for every app by grumbel · · Score: 1

      It blows my mind how many people don't realize their computer is almost always doing more than one thing at a time.

      My computer is doing many things at the same time; however, almost none of them ever get close to even maxing out a single core, and those that do don't happen in parallel. For example, I don't apply Gimp filters and browse webpages at the same time. I do either one or the other, while the other application is happily idling along. And of course while doing that I curse that Firefox is too stupid to split plugins and tabs across multiple threads and that Gimp isn't great at using multiple cores for its filters either.

      Having multiple cores execute different applications might somewhat work with two cores, but with anything larger than that you are basically left with one core maxed out while the other ones idle along. Multicore doesn't bring much benefit unless the applications themselves are clever enough to handle it; the OS doesn't really help there.

    57. Re:Parallel is here to stay but not for every app by Anonymous Coward · · Score: 0

      How true... I work regularly with code 20+ years old. The overwhelming drive in my project right now is to parallelize this aging software. It's a complete mess, costs a ton of money and time, and results in such obfuscation of the code that in my view, it's hardly worth it.

    58. Re:Parallel is here to stay but not for every app by Razalhague · · Score: 1

      2059: "They seriously thought they could get by with just a couple of cores. For a long time they only had one core, and designed their programs with that assumption. Foolish and short-sighted, eh?"

    59. Re:Parallel is here to stay but not for every app by tkinnun0 · · Score: 1

      I have a 2GHz dual-core AMD, several years old. My OS may be doing hundreds of tasks all the time, but those tasks amount to 2-4% of CPU time. I could have all those tasks running on one core and use only 4-8% of that core's time, and literally have the other core waiting for me to press a button. How is adding 14 idle cores to this equation going to improve the responsiveness to my key press? Answer: it's not.

    60. Re:Parallel is here to stay but not for every app by DragonWriter · · Score: 2, Insightful

      And how many of those cores are above 2% utilization for 90% of the day? Parallelization on the desktop is a solution in search of a problem-

      The indication of the need for parallelization isn't the number of cores that are above 2% utilization for 90% of the day, but the number that are above 90% utilization for any part of the day.

      My email, web browsing, word processor, etc aren't cpu limited.

      Some of us use our computers for more than that.

    61. Re:Parallel is here to stay but not for every app by Keynan · · Score: 1

      I agree in large part that you're right, that the network and other forms of I/O are the biggest speed problem. However, you seem to be overlooking a big portion of the desktop market: gamers.

      3D rendering can take advantage of massive parallelization.

    62. Re:Parallel is here to stay but not for every app by PitaBred · · Score: 2, Insightful

      What about that other 10% of the day? It's mostly for gamers and developers, but multiple cores really do speed a lot of things up. And they're starting to be quite useful now that Joe User is getting into video and audio editing and so on. Those most certainly are CPU-limited applications, and they are pretty amenable to parallelism as well. Just because you only email, browse the web and use a word processor doesn't mean that's what everyone does.

    63. Re:Parallel is here to stay but not for every app by Nevyn · · Score: 1

      By definition. If the OS can't schedule it immediately on 4 cores, but can on 3 cores then it can split into 3 threads. No OS I know of can do this or has an API to do this.

      I would assume every OS has an API to do this, but it might not be as fast as you want. Certainly on a modern Linux you can look at columns 6 and 7 in /proc/schedstat for each CPU and get a pretty good idea on CPU idleness. But it's never going to be magic, just as it's never going to be as fast to create a thread as it is to call a function.

      I'd also argue that threads are way overused, if you have a specific need and a defined interface ... then create a new program and use shared memory (or whatever) to transfer data. Threads are mostly used due to a lack of design.

      --
      ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B
    64. Re:Parallel is here to stay but not for every app by sjames · · Score: 1

      Not to mention the many apps (especially background apps) that won't benefit in the slightest by being parallelized and the many various network server apps that get all the benefit they need using the old naive fork on connect. For those, per process, the network is the bottleneck. Extra cores mean more clients can be served at once without a slowdown but no amount of parallelization of the per-client process would speed things up. In many cases, that will remain true even given 10Gbps networking.

    65. Re:Parallel is here to stay but not for every app by shoor · · Score: 1

      Dividing and conquering by having several threads processing images or anti-aliasing fonts, is that a 'difficult' problem for parallelization? I don't have experience in this kind of thing, so maybe I'm naive, but it doesn't seem like it would be that hard to divide up a bunch of data and hand out the individual pieces to different threads, which are mostly programmed in the old-fashioned serial way.

      --
      In theory, theory and practice are the same; in practice they're different. (Yogi Berra & A. Einstein)
    66. Re:Parallel is here to stay but not for every app by MikeBabcock · · Score: 1

      Your computer checks for updates online in the background while you're applying a gimp filter and you don't notice a slow-down because you have multiple cores.

      Maxing out cores isn't necessary to reduce latency. Multiple cores are as much about latency as they are throughput.

      --
      - Michael T. Babcock (Yes, I blog)
    67. Re:Parallel is here to stay but not for every app by Anonymous Coward · · Score: 0

      This is wrong. Most tight loops are better off left serialized--or optimized w/ vectoring--because of cache effects. The moment you try to use multiple threads you have to use locks or atomic ops--which also have a significant cost--and you're probably doing more harm than good.

      Writing good code is the same as ever. Keep related code near other related code--both in the source as well as when running on the CPU.

      For semantically unrelated code use abstractions like message passing to keep them separated so that parallelization of those spheres becomes easy.

      If this isn't enough, then you micro-optimize. Perhaps you do find a loop that should run in parallel (probably a high-level loop). If you followed the above rules, then refactoring should be trivial.
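
      The message-passing separation suggested above could look something like this in C with pthreads. It is a minimal sketch under assumptions: `mailbox`, `mailbox_send` and `mailbox_receive` are hypothetical names, messages are plain ints, and the capacity is fixed at compile time.

      ```c
      #include <assert.h>
      #include <pthread.h>
      #include <stddef.h>

      #define QUEUE_CAP 16

      /* A fixed-size FIFO that is the only state two modules share. */
      struct mailbox {
          pthread_mutex_t lock;
          pthread_cond_t not_empty;
          pthread_cond_t not_full;
          int items[QUEUE_CAP];
          size_t head;    /* index of the oldest message */
          size_t count;   /* number of messages currently queued */
      };

      void mailbox_init(struct mailbox *m)
      {
          assert(m != NULL);
          pthread_mutex_init(&m->lock, NULL);
          pthread_cond_init(&m->not_empty, NULL);
          pthread_cond_init(&m->not_full, NULL);
          m->head = 0;
          m->count = 0;
      }

      /* Blocks while the mailbox is full. */
      void mailbox_send(struct mailbox *m, int msg)
      {
          pthread_mutex_lock(&m->lock);
          while (m->count == QUEUE_CAP)
              pthread_cond_wait(&m->not_full, &m->lock);
          m->items[(m->head + m->count) % QUEUE_CAP] = msg;
          m->count++;
          pthread_cond_signal(&m->not_empty);
          pthread_mutex_unlock(&m->lock);
      }

      /* Blocks while the mailbox is empty; returns the oldest message. */
      int mailbox_receive(struct mailbox *m)
      {
          pthread_mutex_lock(&m->lock);
          while (m->count == 0)
              pthread_cond_wait(&m->not_empty, &m->lock);
          int msg = m->items[m->head];
          m->head = (m->head + 1) % QUEUE_CAP;
          m->count--;
          pthread_cond_signal(&m->not_full);
          pthread_mutex_unlock(&m->lock);
          return msg;
      }
      ```

      The design choice is that all locking lives inside the mailbox, so the code on either side of it can stay serial and tightly grouped.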

    68. Re:Parallel is here to stay but not for every app by DragonWriter · · Score: 1

      There's no connection between OOP and "getNext"

      No, but there is a disconnect between OOP and relational database (at least, as each is most commonly implemented; I'm well aware of the arguments from Date, et al., arguing that the Object-Relational Impedance Mismatch is a result of doing OOP, RDBMSs, or both wrong), which is why the rise of OOP in the form of Java and C++ as dominant languages, set back, or at least slowed the advance of, the use of relational constructs in place of imperative ones, even when the backing store was an RDBMS.

    69. Re:Parallel is here to stay but not for every app by cryptoluddite · · Score: 1

      I would assume every OS has an API to do this, but it might not be as fast as you want. Certainly on a modern Linux you can look at columns 6 and 7 in /proc/schedstat for each CPU and get a pretty good idea on CPU idleness.

      That's not an "API", that's a hack workaround. Open a file, have kernel format the lines, read the lines, parse the lines, figure out how many CPUs to use, create a thread, use sched_* syscalls to assign it to a CPU, ... now CPU is busy again. Whoops. Even if you're using a thread pool it's so clumsy you might as well just assign it to N threads to begin with (where N is any arbitrary number). On top of that, it only works for one program at a time in the entire system (well best case it might be described by some kind of differential equation).

      As far as I know, no OS has an API to do anything close to what I suggested in the post, a way to do fine-grained parallelism well. Maybe on a mainframe... Tera MTA could do this, but it had special hardware threads built into the CPU itself.

      I've done some kernel work myself, and would try this, but it's not my job. And getting something like this past Ingo...

    70. Re:Parallel is here to stay but not for every app by ioshhdflwuegfh · · Score: 1

      I have a 2GHz dual-core AMD, several years old. My OS may be doing hundreds of tasks all the time, but those tasks amount to 2-4% of CPU time. I could have all those tasks running on one core and use only 4-8% of that core's time, and literally have the other core waiting for me to press a button. How is adding 14 idle cores to this equation going to improve the responsiveness to my key press? Answer: it's not.

      Exactly. So what you need is not more or faster cores, but something slower. Get yourself an Atom or something, save electricity and let us crunchers crunch our stuff on multi-core machines.

    71. Re:Parallel is here to stay but not for every app by ioshhdflwuegfh · · Score: 1

      [...]the only parallel compressor I know of is WinRAR, which really does get 4x the speed on my quad-core.

      pbzip2 (parallel bzip2) scales just as well.

    72. Re:Parallel is here to stay but not for every app by Anonymous Coward · · Score: 0

      "or any database"
      Funny you should mention that. SQL Server is crappy enough; I know someone fighting with it because they went to multi-core systems (with slightly slower cores compared to their older single-core systems). Slowdown! SQL Server is *SINGLE THREADED*.

      Anyway... the fact of the matter is, some jobs do naturally split up, but the general-purpose programming languages do NOTHING to help. Fork-and-forget is the easiest way to take advantage of cores; for anything more tightly coupled, a programming language designed for multiprocessing helps a lot, but it's nonstandard too.

  3. Forget "Love" by 800DeadCCs · · Score: 1

    All you need is Fortran.

    1. Re:Forget "Love" by Anonymous Coward · · Score: 0

      Wow, a Beatles reference in a post supporting Fortran. I think that gives /. a good idea of what your age range is.

  4. We need async frameworks too! by PhrostyMcByte · · Score: 4, Insightful

    A lot of problems are I/O driven -- I would like to see more database client libraries allow a full async approach that lets us not block the threads we are trying to do concurrent work on.

    1. Re:We need async frameworks too! by Hurricane78 · · Score: 1

      Doesn't the Haskell compiler automatically figure out how to spread things to threads and when to do what? I thought that was the point of declarative programming.

      --
      Any sufficiently advanced intelligence is indistinguishable from stupidity.
  5. What's so hard? by 4D6963 · · Score: 2, Interesting

    Not trying to troll or anything, but I'd always hear of how parallel programming is very complicated for programmers, but then I learnt to use pthread in C to parallelise everything in my C program from parallel concurrent processing of the same things to threading any aspect of the program, and I was surprised by how simple and straightforward it was using pthread, even creating a number of threads depending on the number of detected cores was simple.

    OK, maybe what I did was simple enough, but I just don't see what's so inherently hard about parallel programming. Surely I am missing something.

    --
    You just got troll'd!
    1. Re:What's so hard? by sneilan · · Score: 1

      The hard part is getting a program that's not meant to be parallelized, parallelized. You may have been working on a program that can be broken into discrete chunks easily. Threads are a performance hack and people are trying to squeeze that last bit of performance out.

      --
      "I like it when the red water comes out.."
    2. Re:What's so hard? by Anonymous Coward · · Score: 2, Insightful

      Do your programs ever leak memory? Did you have to work with a team of 100+ SWE's to write the program? Did you have technical specs to satisfy, or was this a weekend project? This is the difference between swimming 100 meters and sailing across the Pacific.

    3. Re:What's so hard? by beelsebob · · Score: 4, Informative

      It's not creating threads that's hard - it's getting them to communicate with each other, without ever getting into a situation where thread a is waiting for thread b and thread b is waiting for thread a that's hard.
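
      One common way to rule out the "a waits for b while b waits for a" case is to always take locks in a fixed order. A minimal sketch in C with pthreads follows; the `account` type and the `transfer` function are hypothetical, invented only for illustration.

      ```c
      #include <assert.h>
      #include <pthread.h>
      #include <stdint.h>

      struct account {
          pthread_mutex_t lock;
          long balance;
      };

      void account_init(struct account *acct, long balance)
      {
          pthread_mutex_init(&acct->lock, NULL);
          acct->balance = balance;
      }

      /*
       * Move amount between two distinct accounts. Both locks are always
       * taken in address order, so two opposite transfers can never each
       * hold one lock while waiting for the other.
       */
      void transfer(struct account *from, struct account *to, long amount)
      {
          assert(from != to);

          struct account *first = (uintptr_t)from < (uintptr_t)to ? from : to;
          struct account *second = (first == from) ? to : from;

          pthread_mutex_lock(&first->lock);
          pthread_mutex_lock(&second->lock);
          from->balance -= amount;
          to->balance += amount;
          pthread_mutex_unlock(&second->lock);
          pthread_mutex_unlock(&first->lock);
      }
      ```

      If `transfer` instead locked `from` and then `to`, one thread moving a to b and another moving b to a could each grab their first lock and wait forever for the second.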

    4. Re:What's so hard? by Anonymous Coward · · Score: 1, Insightful

      How many lines does it take you to parallelize this with pthreads in C?

      for (i = 0; i < 1000; i++)
          c[i] = a[i] + b[i];

      If it takes you more than 2 lines, then your "language" is too hard to be used everywhere by everyone.
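
      For comparison, here is the same loop using OpenMP rather than pthreads. This is a different technique from the one asked about, shown only as a sketch; the wrapper function `vector_add` is a made-up name, and the pragma only takes effect when the compiler has OpenMP enabled (for example `gcc -fopenmp`), otherwise the loop simply runs serially.

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Element-wise c[i] = a[i] + b[i]; iterations are independent. */
      void vector_add(const int *a, const int *b, int *c, size_t n)
      {
          assert(a && b && c);

          /* One added line: ask OpenMP to split the iterations across threads. */
          #pragma omp parallel for
          for (long i = 0; i < (long)n; i++)
              c[i] = a[i] + b[i];
      }
      ```

      A hand-written pthreads version needs thread creation, a per-thread range, and a join, which is where the extra lines come from.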

    5. Re:What's so hard? by ponraul · · Score: 1

      Yeah. It's easy creating threads.

      However, if the threads then need to share data among themselves, or one class of threads has to wait for another class of threads, or you have threads competing for finite resources, or you need to share state among different threads, it's not so easy.

      The hard part comes from using threads correctly. In school, they teach you how to prove a particular piece of concurrent code cannot deadlock. In real life, you have to be risk averse.

      Better concurrency primitives make for less error prone concurrent code. However, better concurrency primitives are generally more restrictive in the kinds of problems they can easily solve. Something that might be trivial to do with semaphores isn't so easy with monitors.

    6. Re:What's so hard? by Anonymous Coward · · Score: 4, Funny

      How many lines does it take you to parallelize this with pthreads in C?

      One. Newline characters are for wimps.

    7. Re:What's so hard? by Unoti · · Score: 3, Interesting

      The fact that it seems so simple at first is where the problem starts. You had no trouble in your program. One program. That's a great start. Now do something non-trivial. Say, make something that simulates digital circuits-- AND gates, OR gates, NOT gates. Let them be wired up together. Accept an arbitrarily complex setup of digital logic gates. Have it simulate the outputs propagating to the inputs. And make it so that it expands across an arbitrary number of threads, and make it expand across an arbitrary number of processes, both on the same computer and on other computers on the same network.

      There are some languages and approaches you could choose for such a project that will help you avoid the kinds of pitfalls that await you, and provide most or all of the infrastructure that you'd have to write yourself in other languages.

      If you're interested in learning more about parallel programming, why it's hard, and what can go wrong, and how to make it easy, I suggest you read a book about Erlang. Then read a book about Scala.

      The thing is, it looks easy at first, and it really is easy at first. Then you launch your application into production, and stuff goes real funny and it's nigh unto impossible to troubleshoot what's wrong. In the lab, it's always easy. With multithreaded/multiprocess/multi-node systems, you've got to work very very hard to make them mess up in the lab the same way they will in the real world. So it seems like not a big deal at first until you launch the stuff and have to support it running every day in crazy unpredictable conditions.

    8. Re:What's so hard? by quanticle · · Score: 2, Insightful

      Making a threaded application in C isn't difficult. Testing and debugging said application is. Given that threads share memory, rigorously testing buffer overflow conditions becomes doubly important. In addition, adding threading introduces a whole new set of potential errors (such as race conditions, deadlocks, etc.) that need to be tested for.

      It's easy enough to create a multi-threaded version of a program when it's for personal use. However, there are a number of issues that arise whenever a threaded program interacts with the (potentially malicious) outside world, and these issues are not trivial to test for or fix. That's why I think that parallel programs are going to be increasingly written in functional programming languages (Common Lisp, Haskell, Scala, etc.). The limitations on side effects that functional languages impose reduce the amount of interaction between threads, and reduce the probability that a failure in a single thread will propagate through the entire application.

      --
      We all know what to do, but we don't know how to get re-elected once we have done it
    9. Re:What's so hard? by Anonymous Coward · · Score: 1, Insightful

      That's what libraries are for. This is what you youngins continually forget. Properly design and implement libraries of useful code ONCE and use them many times.

      You could easily write a "vector_add" function which spawns (or unlocks pre-spawned which is smarter) threads to perform a variety of tasks. Then from your application a single line of code would perform an optimized parallel vector addition or whatever.

      In fact, a smart DSP lib today would do just that. Pre-spawn a bunch of threads then host a job server which unlocks threads to work on given tasks. That way you have single line functions like vector_add(in1, in2, out, size), etc...
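
      A minimal sketch of that pre-spawned job-server idea, using a mutex and two condition variables. Everything here is an assumption made for illustration (POOL_SIZE, pool_start, pool_stop, and the float element type); it also assumes only one thread calls vector_add at a time.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 4   /* fixed worker count, chosen arbitrarily for this sketch */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t job_ready = PTHREAD_COND_INITIALIZER;
static pthread_cond_t job_done = PTHREAD_COND_INITIALIZER;
static pthread_t workers[POOL_SIZE];

/* Current job; every field below is protected by lock. */
static const float *job_in1, *job_in2;
static float *job_out;
static size_t job_size;
static unsigned long generation;  /* bumped once per posted job */
static int pending;               /* workers that have not finished the current job */
static int shutting_down;

static void *worker(void *arg)
{
    size_t id = (size_t)(uintptr_t)arg;
    unsigned long seen = 0;

    pthread_mutex_lock(&lock);
    for (;;) {
        /* Sleep until a job newer than the last one handled is posted. */
        while (generation == seen && !shutting_down)
            pthread_cond_wait(&job_ready, &lock);
        if (shutting_down)
            break;
        seen = generation;
        const float *in1 = job_in1, *in2 = job_in2;
        float *out = job_out;
        size_t begin = job_size * id / POOL_SIZE;
        size_t end = job_size * (id + 1) / POOL_SIZE;
        pthread_mutex_unlock(&lock);

        for (size_t i = begin; i < end; i++)   /* this worker's slice, done unlocked */
            out[i] = in1[i] + in2[i];

        pthread_mutex_lock(&lock);
        if (--pending == 0)
            pthread_cond_signal(&job_done);
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Spawn the workers once, up front. Returns 0 on success, -1 on failure. */
int pool_start(void)
{
    for (size_t t = 0; t < POOL_SIZE; t++)
        if (pthread_create(&workers[t], NULL, worker, (void *)(uintptr_t)t) != 0)
            return -1;
    return 0;
}

/* Post one job to the pool and block until all workers have finished it. */
void vector_add(const float *in1, const float *in2, float *out, size_t size)
{
    pthread_mutex_lock(&lock);
    job_in1 = in1; job_in2 = in2; job_out = out; job_size = size;
    pending = POOL_SIZE;
    generation++;
    pthread_cond_broadcast(&job_ready);
    while (pending > 0)
        pthread_cond_wait(&job_done, &lock);
    pthread_mutex_unlock(&lock);
}

/* Ask the workers to exit and wait for them. */
void pool_stop(void)
{
    pthread_mutex_lock(&lock);
    shutting_down = 1;
    pthread_cond_broadcast(&job_ready);
    pthread_mutex_unlock(&lock);
    for (size_t t = 0; t < POOL_SIZE; t++)
        pthread_join(workers[t], NULL);
}
```

      The caller really does get a one-line vector_add(in1, in2, out, size), but only after someone has written and debugged the sixty-odd lines behind it.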

      So you could actually write really easy to read parallel programs in C. You just have to know the first thing about software development.

      In short, your rant is a product of not knowing what you are doing.

    10. Re:What's so hard? by synaptik · · Score: 1

      even creating a number of threads depending on the number of detected cores was simple.

      Are you guaranteed that those spawned threads will be evenly distributed amongst the cores, on a given architecture? There's also a matter of locality; you want the threads that are dealing with certain data to run on cores that are close to that data.

      MT is not the same thing as MP. You may have written a multi-threaded app, but when on a single-core you likely didn't see any perf gains. MT apps on a single CPU core can have benefits-- such as, your UI can remain responsive to the user during serious number crunching-- but at 100% CPU load, this necessarily comes with the cost of your number-crunching taking longer.

      MP scalability in software is hard, because you don't know (and shouldn't assume) how many CPU cores are present in the user's system. So, you have to give considerations to:
      * Which aspects of your software's workload are independent, and parallelizable
      * How coarsely or finely you should parallelize the work (as a runtime decision,) based on the number of CPU cores present.
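
      One way to sketch that runtime decision: cap the thread count both by the detected cores and by how much work there is. The helper names plan_threads and online_cores, and the min_grain tuning knob, are assumptions for this example; _SC_NPROCESSORS_ONLN is a widely available sysconf extension (glibc, macOS, the BSDs) rather than a guaranteed POSIX name.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Pick how many worker threads to use for n_items independent work items.
 * min_grain is the smallest number of items worth handing to one thread
 * (a tuning knob assumed for this sketch, not a measured value). */
size_t plan_threads(size_t n_items, size_t cores, size_t min_grain)
{
    assert(min_grain > 0);
    size_t by_work = n_items / min_grain;          /* threads the workload can keep busy */
    size_t n = by_work < cores ? by_work : cores;  /* never more threads than cores */
    return n > 0 ? n : 1;                          /* always at least one thread */
}

/* Number of online cores, falling back to 1 if the query fails. */
size_t online_cores(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? (size_t)n : 1;
}
```

      Calling plan_threads(n, online_cores(), 100) gives a single thread for tiny inputs and one thread per core for large ones, so the single-core case falls out of the same code path.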

      It's also hard because you have to forgo simplicity that you could have had with a single-threaded implementation, even when there is only one CPU core in the user's system.

      --
      HSJ$$*&#^!#+++ATH0
      NO CARRIER
    11. Re:What's so hard? by Yacoby · · Score: 2, Interesting

      Data communication in a foolproof way. Writing a threaded program is easy if the program is simple. You can even get a bit more performance out of a program using multiple threads if you use locking. If you use locking, you end up with the possibility of race conditions, deadlock and other nightmares.

      Extending this to something like a game engine is much harder. Say we split our physics and rendering into two threads. How does the physics thread update the render thread? We could just lock the whole scene graph, but then we don't get much of a performance increase, if at all. We then could use two buffers. The renderer renders the data from one, and the physics thread updates the other. When we are ready to update the frame, we just swap the buffers. Then we end up with some input lag. There are still complications. What happens if we add an AI thread? How does that add data to the buffer in a way that doesn't conflict with the physics thread?
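
      A minimal sketch of the two-buffer scheme, assuming exactly one physics thread and a renderer that takes a private copy of the published state. The names (struct scene, physics_step, render_snapshot) and the toy motion rule are invented for illustration.

```c
#include <assert.h>
#include <pthread.h>

#define N_BODIES 4

struct scene {
    float x[N_BODIES];     /* toy state: one coordinate per body */
    unsigned long frame;   /* how many physics steps produced this state */
};

static struct scene buffers[2];
static int front = 0;   /* buffer the renderer may read; written only by the physics thread */
static pthread_mutex_t swap_lock = PTHREAD_MUTEX_INITIALIZER;

/* Physics thread: build the next state in the back buffer, then publish it. */
void physics_step(float dt)
{
    int back = 1 - front;                 /* safe unlocked: only this thread changes front */
    const struct scene *cur = &buffers[front];
    struct scene *next = &buffers[back];

    for (int i = 0; i < N_BODIES; i++)
        next->x[i] = cur->x[i] + dt * (float)(i + 1);   /* toy motion */
    next->frame = cur->frame + 1;

    pthread_mutex_lock(&swap_lock);       /* the only locked part is the index swap */
    front = back;
    pthread_mutex_unlock(&swap_lock);
}

/* Render thread: copy the latest published state, then draw from the copy. */
void render_snapshot(struct scene *out)
{
    pthread_mutex_lock(&swap_lock);
    *out = buffers[front];
    pthread_mutex_unlock(&swap_lock);
}
```

      The lock is held only for the swap and the copy, not for the whole physics step. The sketch does not answer the AI-thread question: a second writer to the back buffer would need its own coordination.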

      We could use lock free lists, which are very hard to get right. Even some implementations that I have seen end up locking the heap, which we want to avoid. But even then we end up with some issues.
      Don't get me started on debugging threaded applications. Finding that while it works fine on one and two cores, 0.1% of the time on a quad core there is a deadlock.

      So to sum it up. Anyone can write a threaded application where it is easy to split the tasks. If you are designing it from the ground up, it is even easier. If you need to write performance critical maintainable code that involves a lot of communication, it suddenly gets much harder.

    12. Re:What's so hard? by Anonymous Coward · · Score: 0

      then your "language" is too hard to be used everywhere by everyone.

      Perhaps not 'everyone' is meant to be a programmer...

      Let the real men program in parallel

    13. Re:What's so hard? by Anonymous Coward · · Score: 1, Interesting

      I guess this is where 'restrict' comes in. If a, b and c can be determined to be non-aliased and non-overlapping, the compiler may auto-vectorise that for you on an appropriate architecture.
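
      A minimal sketch of the loop with C99 restrict qualifiers. The function name add_noalias is made up here, and whether the compiler actually vectorises it depends on the compiler, flags and target.

```c
#include <assert.h>
#include <stddef.h>

/* restrict is a promise from the caller: during this call, the memory written
 * through c is not also reached through a or b. That removes the aliasing
 * doubt that can otherwise stop the compiler from vectorising the loop.
 * Breaking the promise (overlapping arrays) is undefined behaviour. */
void add_noalias(size_t n, const int *restrict a, const int *restrict b, int *restrict c)
{
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}
```

      This gets SIMD-style parallelism within one core at best; it does not spread the loop across cores.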

      That said, Handel-C, a dying dialect of C which targeted FPGAs, would let you do the following:

      par (i = 0; i < 1000; i++)
          c[i] = a[i] + b[i];

      This would build a massive amount of logic to perform the 1000 adders in parallel on an FPGA, but it's nice syntax. The par could also be replaced with seq to make a sequential version (still using lots of logic, since seq is like an unrolled loop).

    14. Re:What's so hard? by grumbel · · Score: 1

      Implementing threading in a new app written from scratch isn't that hard (even though it has quite a few problems of its own), the real troublesome part is rewriting legacy code that wasn't built for threading, as that often makes a lot of assumptions that simply break in threading.

    15. Re:What's so hard? by johannesg · · Score: 1

      Not trying to troll or anything, but I'd always hear of how parallel programming is very complicated for programmers, but then I learnt to use pthread in C to parallelise everything in my C program from parallel concurrent processing of the same things to threading any aspect of the program, and I was surprised by how simple and straightforward it was using pthread, even creating a number of threads depending on the number of detected cores was simple.

      Really? With the pthread API? Pray tell, how does that work?

      Note that reading from /proc/ is neither part of the pthread API, nor portable...

    16. Re:What's so hard? by Unoti · · Score: 2, Insightful

      You could easily write a "vector_add" function which spawns (or unlocks pre-spawned which is smarter) threads to perform a variety of tasks.

      AC here is right, but also missing something. I often hear about things being "easy". Often when people say something is "easy" they really mean, "it's easy after it's done." This is one of those things where it's only easy after it's done. The code might look easy after it's created and debugged. But getting to the point where it's created and validated and debugged is much harder in some languages and approaches (e.g. C) than it is in others (e.g. Erlang).

      Take someone experienced with multithread, multi process, multi node programming in C. And put them up against someone experienced with same in a language designed for distributed systems. Have them drag race on producing code that expands to an arbitrary number of processes and computers and evenly distributes the load among them in a fault tolerant, smooth way. The person in a language designed for it is going to blow the doors off the person doing it in C in terms of productivity. And that's what these languages are all about.

      If you've got a library already present that does exactly what you need, great, and AC is right on target there. But when you don't have such a library, and you almost never will, then it's great to use a tool that makes the job easy to do well and do quickly.

    17. Re:What's so hard? by burnetd · · Score: 1

      Off the top of my head, one....

      #pragma omp parallel for schedule(static)

      while on OSX that'll use pthreads, and I believe on Linux too, on Windows it'll depend on the compiler used.
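
      For context, a sketch of the whole loop with the OpenMP directive in place. The function name add_arrays and the fixed N are assumptions for the example. The combined "parallel for" form is used because a bare "omp for" outside a parallel region runs on a single thread. With GCC or Clang it needs -fopenmp; without that flag the pragma is ignored and the loop runs serially with the same result.

```c
#include <assert.h>

#define N 1000

/* c[i] = a[i] + b[i] for i in [0, N). */
void add_arrays(const int *a, const int *b, int *c)
{
    /* "parallel" creates the thread team; "for" divides the iterations among it.
     * schedule(static) hands each thread one contiguous chunk of iterations. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];
}
```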

    18. Re:What's so hard? by Anonymous Coward · · Score: 0

      Assuming that you want to add everything:
      (pmap + a b)

      Or just the first 1000 items:
      (pmap + (take 1000 a) (take 1000 b)) /Clojure ftw. :p

    19. Re:What's so hard? by jbolden · · Score: 1

      An example is the locking problem on variables that are shared. Which variables get locked, for how long? How does the lock get released? Too many locks and you run sequentially; too few and you corrupt your shared data.

    20. Re:What's so hard? by Vanders · · Score: 1

      How does the physics thread update the render thread?

      Use asynchronous message passing. No, really. There is no implicit locking involved in such a scheme. Take a look at the BeOS or Syllable GUI APIs for an example.

    21. Re:What's so hard? by Lord+Crc · · Score: 1

      OK, maybe what I did was simple enough, but I just don't see what's so inherently hard about parallel programming. Surely I am missing something.

      For me, the two things that are hardest are designing an efficient parallel algorithm for the target platform and ensuring fast but proper synchronization.

      For instance, if your target is a GPU, then you have a bunch of execution units in parallel, but communication between them is limited. You have to take this into consideration when designing the algorithm.

      If your target is regular CPUs, then you might have a handful of execution units and communication can be fast. However you need to ensure proper synchronization. Locks can be very expensive on some platforms, and so you want to reduce the use of them, especially in highly congested parts of your application. However this can be very difficult to do correctly, so that there aren't any race conditions or data corruption.

      In general, using threads can be very easy; however, for all but the most trivial issues it can be a bit tricky to do it efficiently, so that it's actually worthwhile to use threads. Imho.

    22. Re:What's so hard? by Salamander · · Score: 1

      Spawning threads to handle isolated tasks within a single address space isn't all that hard. Handling interrelated tasks across more processors than could possibly share one address space, doing it correctly so it doesn't have deadlocks or race conditions, distributing the work so there aren't performance-killing bottlenecks even in a non-trivial system topology handling an ever-varying work distribution, etc. . . . that's where things get just a bit more challenging, and what the newer crop of languages are supposed to handle. Personally I don't think they help all that much with more than a small class of problems (mostly those that are heavily oriented toward dealing with some kind of regular array), but it's silly to dismiss them without even understanding their problem domain.

      On a slightly different note, BTW, parallel programming hasn't become more difficult. It has merely become more common. Whereas it used to be the domain of a few with proper training, now a whole bunch of barely-competent schleps are trying to do it as well. They see pthread_create and it blows their mind, and they think now they're masters of parallel programming, hardly aware that creating threads is to parallel programming what the "if" statement is to programming in general - the very first door you have to go through, before you can even realize how much more there is to know.

      --
      Slashdot - News for Herds. Stuff that Splatters.
    23. Re:What's so hard? by Beale · · Score: 1

      Not to sound like a fanboy, but <3 OpenMP.

      (Though it's not exactly using pthreads directly, even though the majority of implementations are built on pthreads.)

    24. Re:What's so hard? by 4D6963 · · Score: 1

      MP scalability in software is hard, because you don't know (and shouldn't assume) how many CPU cores are present in the user's system.

      Well actually what I did (and what I think should be done) is to do just like with anything else and design it based on a variable, i.e. you know what you want if there's one core, you know what you want if there's 2, or 8, or 64, and based on that you write an algorithm that takes the number of detected cores into account and behaves as you want it to.

      --
      You just got troll'd!
    25. Re:What's so hard? by ivan_w · · Score: 1

      But when you don't have such a library, and you almost never will, then it's great to use a tool that makes the job easy to do well and do quickly.

      But when you don't have such a tool, and you almost never will, then it's great to use a library that makes the job easy to do well and do quickly.

      My point is : what's the difference between a "tool" and a "library" ?

    26. Re:What's so hard? by Timothy+Brownawell · · Score: 1

      Not trying to troll or anything, but I'd always hear of how parallel programming is very complicated for programmers, but then I learnt to use pthread in C to parallelise everything in my C program from parallel concurrent processing of the same things to threading any aspect of the program, and I was surprised by how simple and straightforward it was using pthread, even creating a number of threads depending on the number of detected cores was simple.

      Really? With the pthread API? Pray tell, how does that work?

      Note that reading from /proc/ is neither part of the pthread API, nor portable...

      #include <unistd.h>
      long num_cores = sysconf(_SC_NPROCESSORS_ONLN);  /* sysconf returns long; -1 on error */

    27. Re:What's so hard? by 4D6963 · · Score: 1

      How does that add data to the buffer in a way that doesn't conflict with the physics thread?

      Well, I don't think that's that complicated actually. Make the physics thread get all the data from the AI thread all at once, let it perform its little loop, let it take AI data again, repeat... That's why I said it seems simple, of course it's not always simple, but as far as my limited experience goes all you have to do is find an elegant way to do this on paper and it all works out. I for one am not a big fan of locking, I try as much as possible to use mutex free code and just design things so that threads can keep doing their thing blindly without ever waiting. Well perhaps that's not always possible, but I think it's possible in most cases.

      As for the physics-rendering problem, I think the most elegant solution is to use double buffering as you said, but to avoid the problem of the rendering loop starting just before the physics one ends, perhaps you can use some time measurement to determine whether the rendering loop should start or if it's late enough in the physics loop for it to wait for it.

      I for one have never met any problems with debugging using GDB, and I think that if you get deadlocks on quad cores then there's something wrong about your design to begin with, i.e. you didn't plan for N cores correctly.

      This being said you're absolutely right about it being easy if you design it from the ground up. Actually I'd consider turning a complete single threaded program into a parallelised program to be madness, not that you always have the choice though..

      --
      You just got troll'd!
    28. Re:What's so hard? by Anonymous Coward · · Score: 1, Insightful

      Simple answer: synchronized shared data access.

      Technically creating the threads is the easy bit. Making sure that the threads don't fight over shared mutable data is so hard that most people give up. Frinstance, "serializable" isolation is the only really guaranteed isolation level for database transactions under all circumstances, but it can turn your DB into a uniprocessor; most people avoid it. Java's Swing UI gives up on data locking, and just says "only my thread touches the UI" and provides API to book tasks on it.

      There exist lots of cases where threaded parallelism is easy to implement. But many cases are hard, with subtle pathological difficulties. Once you bundle a few of these cases into the same system, correctness becomes impossible to estimate. Add to that the fact that many of the bugs are intermittent, debugger-resistant (schroedingbugs!) and potentially fatal (data corruption, when you encourage liveness; deadlocks, when you vote for safety), and you have some seriously difficult problems.

    29. Re:What's so hard? by 4D6963 · · Score: 1

      Who said I used nothing but the pthread library, or nothing but standard calls? If you want to write a portable program and you want to write more than a command line tool best believe you'll use platform dependent #ifdefs no matter what.

      Like it matters, it's just one call to make at the beginning of the program.

      --
      You just got troll'd!
    30. Re:What's so hard? by spiffmastercow · · Score: 1

      How many lines does it take you to parallelize this with pthreads in C?

      for (i = 0; i < 1000; i++) c[i] = a[i] + b[i];

      If it takes you more than 2 lines, then your "language" is too hard to be used everywhere by everyone.

      Point 1: This is an absurdly simple example
      Point 2: assuming 1000 divides cleanly into variable 'threadcount' and you have a function called 'start_the_thread(int *, const int *, const int *, size)'...

      for (i = 0; i < 1000; i += (1000 / threadcount))
          start_the_thread(c+i, a+i, b+i, 1000/threadcount);


      Maybe it doesn't look clean, but it is 2 lines. And I'm sure someone else could do it a lot better than that.

    31. Re:What's so hard? by Anonymous Coward · · Score: 0

      Those who can multithread, do. Those who can't, whine about it on Slashdot.

    32. Re:What's so hard? by Beezlebub33 · · Score: 1

      Why does it need a library? why isn't this the compiler's job?

      Does an operation have a side effect? If not, then it can happen in parallel with other operations that are working on different data. In this case, there isn't a side effect (unless you're in C++ land and you've changed the meaning of +) and each operation occurs on a different piece of data. So, if you have 1000 cores, they should all happen at the same time. The problem is that 1) you are going to be I/O bound and 2) this is a trivial example and in general it's much harder to determine if there are side effects. Which is why C sucks for parallelism and Erlang rocks.

      --
      The more people I meet, the better I like my dog.
    33. Re:What's so hard? by Anonymous Coward · · Score: 0

      There are several classes of parallel programming

      The 'stupidly parallelizable' ones. These are large chunks of data where the results of one iteration do not really affect the next iteration. So you can really fire up as many threads as you have processors and see a nice speedup. Things like this are ray tracing, searching, sorting, etc.

      Some interdependence. These are where each thread can run on its own, doing its own thing, but sometimes has to spin-wait on some other thread to finish whatever it is doing. This sort of class is usually wrapped around a blocking call, such as waiting on data from a TCP/IP port, with some sort of parent thread waiting around for results or doing other things like drawing the screen.

      Massive interdependence. These are where there are global variables and at any time some thread can be using them. So you need to lock around them. You can end up with locking trains, where the parallel code actually just ends up acting like serial code. In these cases it can actually be slower. This is the class that people think of when they talk about 'languages that just do it for them'. They are not quite sure what they are asking for and just want the 'compiler to do it for them'. Which is noble and all but at this time does not work well unless well thought out.

      The hard part comes from locking interdependence. For example, thread 1 locks a then locks b, while thread 2 locks b then locks a. In some cases you can end up with a deadlock (in this case I would say it is assured at some point).
      Weird memory issues. For example in Win32 you really should not share handles between threads but it still works. It will just suddenly start acting wonky. Why? Because of the way memory is allocated for the thread. Which is an implementation specific detail but each threading library or system has its own quirks and tribal knowledge rules.
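
      One common way out of that a-then-b versus b-then-a trap is to make every thread take the locks in the same global order. A minimal sketch, ordering by address; struct account, lock_pair, unlock_pair and transfer are hypothetical names for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

struct account {
    pthread_mutex_t lock;
    long balance;
};

/* Lock both accounts in a globally consistent order (lower address first),
 * so two threads can never each hold one lock while waiting for the other. */
static void lock_pair(struct account *x, struct account *y)
{
    if ((uintptr_t)x > (uintptr_t)y) {
        struct account *tmp = x;
        x = y;
        y = tmp;
    }
    pthread_mutex_lock(&x->lock);
    pthread_mutex_lock(&y->lock);
}

/* Release order does not matter for deadlock avoidance. */
static void unlock_pair(struct account *x, struct account *y)
{
    pthread_mutex_unlock(&x->lock);
    pthread_mutex_unlock(&y->lock);
}

/* Move amount between two distinct accounts; safe to call from any thread
 * in either direction. */
void transfer(struct account *from, struct account *to, long amount)
{
    assert(from != to);   /* locking the same mutex twice would self-deadlock */
    lock_pair(from, to);
    from->balance -= amount;
    to->balance += amount;
    unlock_pair(from, to);
}
```

      With this, transfer(&a, &b, ...) on one thread and transfer(&b, &a, ...) on another both lock the lower-addressed account first. The catch is that every piece of code touching these locks has to follow the same rule.
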

      Race conditions and memory inconsistency are why threading is hard. With small simple programs it is pretty easy to keep straight in your head. But as systems grow and time goes by it becomes hard to keep it all straight without introducing new inconsistencies into the state of the program.

      Others have pointed out that communication is also an issue. That is usually in the form of some sort of lock/semaphore/queue. In some cases the communication between threads can overwhelm what you are doing and actually make things worse.

      It all really comes down, at this point in time, to using the thing that holds your ears apart when using threads, and watching out for places where things can become inconsistent.

      Adding threading in is EASY. Making sure it doesn't totally trash your program. Now that's hard...

    34. Re:What's so hard? by Anonymous Coward · · Score: 0

      If you've got a library already present that does exactly what you need, great, and AC is right on target there. But when you don't have such a library, and you almost never will, then it's great to use a tool that makes the job easy to do well and do quickly.

      "Almost never"? OpenMP is pretty much standard!

    35. Re:What's so hard? by kevinNCSU · · Score: 1

      And add on to that the fact that you need to make sure the results of Thread A and B are deterministic, meaning that the threads won't produce different results depending upon which thread is given cycles first. See http://en.wikipedia.org/wiki/Race_condition. I know that's where waiting and therefore deadlock problems come in, but it is usually these race conditions that people overlook.
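
      The textbook instance is two threads bumping a shared counter. A minimal sketch with the mutex that makes the result independent of scheduling; the names bump and run_two_threads are invented for this example.

```c
#include <assert.h>
#include <pthread.h>

#define INCREMENTS 100000

static long counter;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *bump(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        /* counter++ is a read-modify-write. Without the lock, two threads can
         * read the same old value and one of the two updates is lost. */
        pthread_mutex_lock(&counter_lock);
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs two incrementing threads; returns the final count, or -1 on failure. */
long run_two_threads(void)
{
    pthread_t a, b;

    counter = 0;
    if (pthread_create(&a, NULL, bump, NULL) != 0)
        return -1;
    if (pthread_create(&b, NULL, bump, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```

      With the lock the total is 200000 however the scheduler interleaves the threads. Remove the lock and the total can come out lower, by a different amount on each run.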

    36. Re:What's so hard? by vikstar · · Score: 1

      >> c(1:1000) = a(1:1000) + b(1:1000)

      Matlab.

      --
      The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.
    37. Re:What's so hard? by Anonymous Coward · · Score: 0

      You forgot to show the code for "start_the_thread".

      So in general to parallelize a program you're going to sift through it, find each statement that is parallelizable, write a one-off function that implements it inside of a thread, and replace the original statement with a call to that one-off function?

      That's insane. No wonder people are asking for better tools.

    38. Re:What's so hard? by johannesg · · Score: 1

      Ah, thanks, that is useful to know. Certainly a lot more useful than "just make it system dependent" like the original poster apparently did...

    39. Re:What's so hard? by mrlibertarian · · Score: 1

      Multi-threading is hard when your threads share memory with each other. At first, it seems easy enough, but as your program becomes more complicated, you'll have to use more mutexes. The more mutexes you have, the more likely it is that you will run into dead-lock. And even after your program is working, you'll be afraid to modify certain parts of it, because you know that one wrong move could lead to a race condition or a dead lock.

      So, what's the solution? In my opinion, whenever you're writing a complicated, multi-threaded program, you need to use the actor pattern. In other words, instead of sharing memory, have your threads (i.e. actors) send messages to each other. Suddenly, everything is pretty simple. You don't have to worry about dead-lock anymore. You don't have to worry that some other thread is going to mess with one of your structures. For me, the actor pattern was a life-saver.
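
      A minimal sketch of that pattern in C with pthreads: one actor thread owns its state and other threads reach it only through a bounded mailbox. All names (struct mailbox, mailbox_send, mailbox_receive, struct adder, MSG_STOP) are assumptions for this example, and messages are plain non-negative ints to keep it short.

```c
#include <assert.h>
#include <pthread.h>

#define MAILBOX_CAP 16
#define MSG_STOP (-1)   /* sentinel telling the actor to exit */

struct mailbox {
    int items[MAILBOX_CAP];   /* ring buffer of pending messages */
    int head, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
};

void mailbox_init(struct mailbox *m)
{
    m->head = 0;
    m->count = 0;
    pthread_mutex_init(&m->lock, NULL);
    pthread_cond_init(&m->not_empty, NULL);
    pthread_cond_init(&m->not_full, NULL);
}

/* Enqueue a message, blocking while the mailbox is full. */
void mailbox_send(struct mailbox *m, int msg)
{
    pthread_mutex_lock(&m->lock);
    while (m->count == MAILBOX_CAP)
        pthread_cond_wait(&m->not_full, &m->lock);
    m->items[(m->head + m->count) % MAILBOX_CAP] = msg;
    m->count++;
    pthread_cond_signal(&m->not_empty);
    pthread_mutex_unlock(&m->lock);
}

/* Dequeue the oldest message, blocking while the mailbox is empty. */
int mailbox_receive(struct mailbox *m)
{
    pthread_mutex_lock(&m->lock);
    while (m->count == 0)
        pthread_cond_wait(&m->not_empty, &m->lock);
    int msg = m->items[m->head];
    m->head = (m->head + 1) % MAILBOX_CAP;
    m->count--;
    pthread_cond_signal(&m->not_full);
    pthread_mutex_unlock(&m->lock);
    return msg;
}

/* The actor: total is touched only by the actor's own thread. */
struct adder {
    struct mailbox inbox;
    long total;
};

void *adder_run(void *arg)
{
    struct adder *self = arg;
    for (;;) {
        int msg = mailbox_receive(&self->inbox);
        if (msg == MSG_STOP)
            break;
        self->total += msg;
    }
    return NULL;
}
```

      The only lock left is inside the mailbox, and it is never held while another lock is taken. A bounded mailbox can still stall a sender when it fills up, so the design is simpler to reason about rather than automatically deadlock-proof.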

      Sure, I still share memory here and there, but whenever the program threatens to get away from me, I use the actor pattern to drive away the complexity.

    40. Re:What's so hard? by Anonymous Coward · · Score: 0

      In C#/.NET 4.0 it will not be much longer. Use Parallel.For() for that.

    41. Re:What's so hard? by Ihlosi · · Score: 1
      Not trying to troll or anything, but I'd always hear of how parallel programming is very complicated for programmers,

      Yeah, yeah, any idiot can do parallel programming. The hard parts are a) finding ways to make parallel programming efficient and b) debugging such a program, since you have to deal with all kinds of concepts that never occur in a single-threaded program (reentrancy, deadlocks, locking mechanisms, etc).

    42. Re:What's so hard? by Anonymous Coward · · Score: 0

      100+ SWEs ? Sounds suspiciously like sweatshop programming to me. Like the last mess I cleaned up where 30+ "SWE"s worked for a year on a program that had to be gutted and the code thrown away. For one use case I threw away 60,000 lines of SQL code that would NEVER work and replaced it with 400 lines of code that not only worked, but executed (accurately) 10,000 times faster.

      Teams writing major ERP systems don't have that many software engineers working. Maybe just work with 20 really good software engineers instead?

      Maybe the problem isn't the coding language.

    43. Re:What's so hard? by jknapka · · Score: 1

      The truly difficult thing is to make parallelization completely automatic -- you write the code in a "natural" way, and the compiler and/or runtime environment figure out how to do the parallelization. This was a major driving force behind the development of declarative and functional programming languages like Prolog and ML -- when your language doesn't permit side-effects or destructive assignment, then operations can be executed in any order you like, including simultaneously in many cases. You just need to respect data-flow -- ensure that computations that depend on one another's results are done in the appropriate order. In the absence of destructive assignment, data-flow is a lot easier to manage. For example, Prolog queries can be parallelized automatically as long as certain language constructs are avoided, and the Parlog system achieved that to some extent. This will never be possible with C, because C encourages you to use side effects that cannot be optimized away.

    44. Re:What's so hard? by mkramer · · Score: 1

      In addition to the other commentaries:

      Threading only applies to a subset of parallelizable cases, on a subset of computing architectures.

      You're not going to use threads to decompose a for loop doing simple non-dependent mathematical operations over a large vector. That's why the ugly vector operations were added to C for AltiVec/MMX/SSE/etc. It's another type of problem.

      As long as software development is stuck in a cycle of having to invent new programming solutions for every new computing architecture, we're not going to see a whole lot of variance/advancement in computer architectures.

      What is desperately being sought now is a much more generic way of indicating parallelism in code, that a compiler can then parallelize or not parallelize using any number of techniques, depending on the specific architecture it's compiling to.

      People talk about fancy super-intelligent compilers, but that's not really what we're after now. The key is "merely" being able to concisely indicate data relationships (foremost dependency, but other attributes as well), which will open the door for any number of hardware and software innovations.

    45. Re:What's so hard? by sjames · · Score: 1

      Take one of those threads and make it into 10 that run in parallel. Threading at natural boundaries is easy. The hard part is when you do that and still want to utilize more CPUs.

      Take a for loop that goes from (for example) 0-999. Now split it into 4 threads that together cover that space. You could have thread 1 do 0-249, etc or you could interleave them (so thread one covers 0,4,8,12 while thread 2 does 1,5,9,13, etc). Now do it without obfuscating the flow of your code.
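
      The two splits come down to index arithmetic. A small sketch with threads numbered from 0; the thread creation itself is left out, and the function names and the work() stand-in are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the real per-index work: squares the index. */
static long work(size_t i)
{
    return (long)(i * i);
}

/* Block split: thread t of nthreads owns the contiguous range [*begin, *end). */
void block_range(size_t n, size_t nthreads, size_t t, size_t *begin, size_t *end)
{
    assert(nthreads > 0 && t < nthreads);
    *begin = n * t / nthreads;
    *end = n * (t + 1) / nthreads;
}

/* What thread t would compute under the block split. */
long sum_block(size_t n, size_t nthreads, size_t t)
{
    size_t begin, end;
    long sum = 0;

    block_range(n, nthreads, t, &begin, &end);
    for (size_t i = begin; i < end; i++)
        sum += work(i);
    return sum;
}

/* What thread t would compute under the interleaved split:
 * indices t, t + nthreads, t + 2*nthreads, ... */
long sum_cyclic(size_t n, size_t nthreads, size_t t)
{
    long sum = 0;

    for (size_t i = t; i < n; i += nthreads)
        sum += work(i);
    return sum;
}
```

      Either way the per-thread pieces cover 0-999 exactly once. Block splits tend to be kinder to caches when threads write neighbouring elements, while interleaving can balance load better when the cost per index varies.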

      If the computation of N depends on the value of N-1, you're screwed unless you can somehow re-arrange the computation cleverly until that is no longer true. Sometimes you can, sometimes you can't. For example, you might be able to use 2 threads in lockstep such that a partial value of N based on independent variables is computed while the computation of N-1 is completed. If your locking is really efficient and you don't blow the cache out, that might even work. You could up to double the speed that way. Otherwise, you could easily end up slower than the serial implementation. Recognizing the difference before you spend hours (or days) experimenting can be really hard. Making sure your experiment will behave the same way with real world data can be interesting as well.
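
      A running sum is a classic case where the N-depends-on-N-1 chain can be rearranged. In this sketch (names invented here) the loops run serially; the point is that the two per-block passes have no dependency between blocks, so each block could be handed to its own thread, leaving only a short serial step over the block totals.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCKS 64   /* arbitrary cap for this sketch */

/* In-place inclusive prefix sum: x[i] becomes x[0] + ... + x[i]. */
void blocked_prefix_sum(long *x, size_t n, size_t nblocks)
{
    long offset[MAX_BLOCKS];
    long running = 0;

    assert(nblocks >= 1 && nblocks <= MAX_BLOCKS);

    /* Pass 1: scan each block on its own. No block reads another block's data. */
    for (size_t b = 0; b < nblocks; b++) {
        size_t begin = n * b / nblocks, end = n * (b + 1) / nblocks;
        for (size_t i = begin + 1; i < end; i++)
            x[i] += x[i - 1];
    }

    /* Serial step: offset[b] is the sum of everything before block b.
     * It touches only nblocks values, not n. */
    for (size_t b = 0; b < nblocks; b++) {
        size_t begin = n * b / nblocks, end = n * (b + 1) / nblocks;
        offset[b] = running;
        if (end > begin)
            running += x[end - 1];
    }

    /* Pass 2: shift each block by its offset. Again independent per block. */
    for (size_t b = 0; b < nblocks; b++) {
        size_t begin = n * b / nblocks, end = n * (b + 1) / nblocks;
        for (size_t i = begin; i < end; i++)
            x[i] += offset[b];
    }
}
```

      This does about twice the additions of the plain serial loop, which is the trade described above: it only pays off when there are enough cores and enough data to cover the extra work and the synchronization.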

      In other cases, a coarser grained parallelism will be the best you can hope for. The strategy there is to isolate the inter-thread dependencies to the border cases. At the borders, once again you'll need efficient locking. You'll also need to compute the optimum number of threads to use for a given array of data. Use too many or too few and you slow down. At the extremes on either end, you run slower than just having a single thread do it.

    46. Re:What's so hard? by beej · · Score: 1

      I think that if you get deadlocks on quad cores then there's something wrong about your design to begin with, i.e. you didn't plan for N cores correctly.

      Perhaps. I'm going to quote from the paper The Problem With Threads here:

      Part of the Ptolemy Project experiment sought to determine whether we could develop effective software engineering practices for an academic research setting. We developed a process that included a four-level code maturity rating system (red, yellow, green, and blue), design reviews, code reviews, nightly builds, regression tests, and automated code coverage metrics. We wrote the kernel portion that ensured a consistent view of the program structure in early 2000, design reviewed to yellow, and code reviewed to green. The reviewers included concurrency experts, not just inexperienced graduate students.

      We wrote regression tests that achieved 100 percent code coverage. The nightly build and regression tests ran on a two-processor SMP machine, which exhibited different thread behavior than the development machines, which all had a single processor.

      The Ptolemy II system itself became widely used, and every use of the system exercised this code. No problems were observed until the code deadlocked in April 2004, four years later.

      Our relatively rigorous software engineering practice had identified and fixed many concurrency bugs. But that a problem as serious as a deadlock could go undetected for four years despite this practice is alarming. How many more such problems remain? How long must we test before we can be sure to have discovered all such problems? Regrettably, I must conclude that testing might never reveal all the problems in nontrivial multithreaded code.

      I've learned quite a bit about programming in Erlang recently, and have grown quite fond of no shared state, superlight processes, and message passing.
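
      For anyone who hasn't seen the style, here is a rough Python analogue of "no shared state plus message passing". It is not Erlang: Python threads are far heavier than Erlang processes, and the message names and the `counter_process` function are invented for this sketch. The point is only that the counter lives in one place and everybody else talks to it through a mailbox.

```python
import queue
import threading


def counter_process(inbox):
    # All state lives in this local variable; nobody else can touch it.
    count = 0
    while True:
        message, reply_to = inbox.get()
        if message == "increment":
            count += 1
        elif message == "get":
            reply_to.put(count)
        elif message == "stop":
            break


inbox = queue.Queue()
worker = threading.Thread(target=counter_process, args=(inbox,))
worker.start()

for _ in range(3):
    inbox.put(("increment", None))

reply = queue.Queue()
inbox.put(("get", reply))
total = reply.get()  # 3: messages from one sender are handled in order
inbox.put(("stop", None))
worker.join()
```

      There is no lock in the user code at all; the queue is the only synchronisation point.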

    47. Re:What's so hard? by ioshhdflwuegfh · · Score: 1

      >> c(0:999) = a(0:999) + b(0:999) Matlab.

      or, even simpler:
      +

      (APL)

    48. Re:What's so hard? by 4D6963 · · Score: 1

      That sounds like a good approach indeed. If you have the luxury of having all the flexibility possible to make a multithreaded program then if you do things right I'm sure that in most cases you can come up with a design that will avoid the issues commonly met in parallel programs.

      --
      You just got troll'd!
    49. Re:What's so hard? by 4D6963 · · Score: 1

      Sounds like you're talking about something like Apple's Grand Central Dispatch. If I'm not mistaken, instead of creating threads yourself, you indicate which parts of your code can be parallelised, and Grand Central takes care of parallelising it, with even less overhead than it takes to create a thread.

      That said, current threading implementations such as pthreads aren't the most elegant way to do those things, but I still see threads as an integral part of a program's high-level algorithm. That is, if I make a flow chart of my program, the threads will be the things that run simultaneously, and given what I've decided to do it makes sense to use them. Then again, all I know is threads, so of course I'm not going to think in terms of a more elegant and sophisticated paradigm.

      --
      You just got troll'd!
  6. GCC OpenMP by Anonymous Coward · · Score: 0

    Agree. Adding a new language isn't going to help much. Maybe extending an old one is the best way to go. For what it's worth, GCC has offered an optional, small set of parallel programming constructs at the compiler level (OpenMP) for at least a year now.

  7. Re:Chapel? by ponraul · · Score: 1

    Huh? What spoken language is Perl supposed to be?

  8. Old languages designed for parallel processing? by number6x · · Score: 4, Informative

    Erlang is an older established language designed for parallel processing.

    Erlang was first developed in 1986, making it about a decade older than Java or Ruby. It is younger than Perl or C, and just a tad older than Python. It is a mature language with a large support community, especially in industrial applications. It is time tested and proven.

    It is also Open source and offers many options for commercial support.

    Before anyone at DARPA thinks that they can design a better language for concurrent parallel programming, I think they should be forced to spend 1 year learning Ada, and a second year working in Ada. If they survive they will most likely be cured of the thought that the Defense department can design good programming languages.

    1. Re:Old languages designed for parallel processing? by Anonymous Coward · · Score: 1, Interesting

      Labview by National Instruments can do parallel programming automatically. I myself find the coding fairly straightforward as well. Its problem is performance. Well thought out C++ seems to easily win the performance war, especially when you deal with all the buffer copies Labview has a tendency to toss in. Currently I've been using a mix of Labview and C++ (DLLs), which seems to offer some of the benefits (debugging/performance/flexibility) of each, although I'm not sure that combining C# and C/C++ wouldn't have been an even better solution.

      In particular, I think the two biggest flaws in existing Labview are the limitations on efficient string manipulation and the lack of true inlined functions. Labview does have subroutines, but that isn't really the same. Some C++ can get around such things, but it is not an ideal solution. Still, if you need to put things together _fast_, Labview is useful for that.

    2. Re:Old languages designed for parallel processing? by coppro · · Score: 4, Informative

      Erlang is probably the best language for servers and similar applications available. Not only is it inherently parallel (though they've only recently actually made the engine multithreaded, as the parallelism is in the software), but it is very easily networked as well. As a result, a well-written Erlang program can only be taken down by simultaneously killing an entire cluster of computers.

      What's more, it has a little-seen feature of being able to handle code upgrades to most any component of the program without ever stopping - it keeps two versions of each module (old and new) in memory, and code can be written to automatically ensure a smooth transition into the new code when the upgrade occurs.

      If I recall correctly, the Swedish telecom where Erlang was designed had one server running it with 7 continuous years uptime.

    3. Re:Old languages designed for parallel processing? by eulernet · · Score: 1

      Ada was created by a French team: http://en.wikipedia.org/wiki/Ada_(programming_language)

      Four teams competed to create a new language suitable for the DoD, and the French team won.

    4. Re:Old languages designed for parallel processing? by Bearhouse · · Score: 1

      Before anyone at DARPA thinks that they can design a better language for concurrent parallel programming, I think they should be forced to spend 1 year learning Ada, and a second year working in Ada. If they survive they will most likely be cured of the thought that the Defense department can design good programming languages.

      Well, it's based on Pascal, so whatya expect? Still, does work. (The 777 flight control system is written in it...if it was written in, for example, C or VB, would you get on the 'plane?)

    5. Re:Old languages designed for parallel processing? by Anonymous Coward · · Score: 0

      Funny, all the hacker type people I know that used to badmouth Ada just as you do after having to learn it at university and that had to use it for real afterward ended up loving the language. And that includes the long haired type ;)

      Ada is different enough that the initial encounter is rough. But that passes after using the language for a (significant) while, and then you can appreciate the reasons behind the language design decisions.

      And for the curious, there is a "design rationale" document for each version of Ada. Very interesting read.

    6. Re:Old languages designed for parallel processing? by CrashandDie · · Score: 1

      If it were written in VB, at least you could easily mod the GUI to trace an IP address...

    7. Re:Old languages designed for parallel processing? by Anonymous Coward · · Score: 2, Informative

      The AXD301 which was the first major product to be implemented in Erlang by Ericsson was measured to achieve nine-nines uptime the first couple of years.

      To read more about concurrency in Erlang and uptime have a look at this article:
      http://www.computer.org/portal/pages/dsonline/2007/10/w5tow.xml by Steve Vinoski

    8. Re:Old languages designed for parallel processing? by Beale · · Score: 1

      And you have to love a language where you can in all seriousness say "And now we must initiate... The Ravenscar Profile."

    9. Re:Old languages designed for parallel processing? by Col+Bat+Guano · · Score: 1
      The DoD didn't design Ada, they set a number of requirements then held a competition.

      Several design teams consisting of people from industry and academia worked on it.

      Subsequent updates to the language (protected types are relevant for this topic) were added in Ada95. The parallel part of the language has remained pretty much stable since (interfaces were added, but it hasn't made a big difference to the parallel section).

      What part of parallel programming in Ada don't you like?

    10. Re:Old languages designed for parallel processing? by Anonymous Coward · · Score: 2, Interesting

      If I recall correctly, the Swedish telecom where Erlang was designed had one server running it with 7 continuous years uptime.

      Certainly their AXD301 telephony-over-ATM switch has achieved 'nine nines' uptime - i.e. 99.9999999% up, which allows for ~31ms downtime a year.
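
      The arithmetic behind that figure is easy to check. A small sketch, assuming an average year of 365.25 days (the function name is just for illustration):

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumes an average (Julian) year


def downtime_ms_per_year(nines):
    """Allowed downtime in milliseconds per year for an availability of N nines."""
    unavailable_fraction = 10 ** -nines
    return unavailable_fraction * SECONDS_PER_YEAR * 1000


print(round(downtime_ms_per_year(9), 1))  # roughly 31.6
```

      For comparison, the more commonly quoted "five nines" works out to a bit over five minutes a year on the same assumption.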

      British Telecom use them to power their network, and they passed the ultimate test (finals voting for Pop Idol, the UK original of American Idol) with flying colours.

      And Erlang is a fun language to program in.

    11. Re:Old languages designed for parallel processing? by lordholm · · Score: 1

      I worked with Ada for about two years. It is an awful language. Most of the issues that Ada solves are solvable in C by using assert.

      The main problem with the language is that it is too complex and big. This means that it is impossible to keep the language in your head; you have to look up syntax rules while you are coding, even after you have been writing it for several years.

      The only nice thing in the language is the concurrency model which actually simplifies the usage of tasks.

      --
      "Civis Europaeus sum!"
    12. Re:Old languages designed for parallel processing? by Robb · · Score: 1

      I don't think the active/passive synchronisation provided by Ada is the solution for using multiple cores efficiently but since I keep seeing people reinvent the same solution in Java there must be something useful about it. I found Ada quite easy to learn as it is remarkably consistent although that is the result of it being designed by just a couple people working closely together.

    13. Re:Old languages designed for parallel processing? by Robb · · Score: 1

      I worked with Ada for about two years. It is an awful language. Most of the issues that Ada solves are solvable in C by using assert.

      Obviously you were trying to write C in Ada; I agree, that would be truly awful.

    14. Re:Old languages designed for parallel processing? by lordholm · · Score: 1

      Ada gives you:

      1. Bitlayout of records (with a very verbose syntax)
      2. Strict types (i.e. integers with ranges)
      3. Enumerated types
      4. Tasks
      5. Bounded arrays
      6. Generic programming
      7. Modular programming

      Points 2 and 5 are solvable by asserts and by using contracts in your code (seriously, the Eiffel way of programming with contracts just feels so much more sane in my eyes than the strict types, especially in a systems programming language).

      4 is really missing in C, as I said. 1 and 3 exist in C. Regarding 6, yes, that is a bit of a problem with C, but on the other hand, in the projects I was involved in, generic functions were banned by higher powers. 7 is also possible in C by adhering to discipline. And in most Ada projects, the government (who usually are the contracting authority) mandates discipline.

      So in the end, the reason to use Ada is what?

      --
      "Civis Europaeus sum!"
    15. Re:Old languages designed for parallel processing? by Anonymous Coward · · Score: 0

      Erlang is for concurrency, not parallelism.

      The former is about things genuinely happening at the same time (such as server/client communication, different parts of a UI etc.), the latter is for speeding up a non-concurrent operation by running it on multiple threads. Erlang really offers nothing especially nice for doing that. Threads are usually the wrong abstraction for parallelism, for one thing.

  9. Awful example in the article by theNote · · Score: 3, Insightful

    The example in the article is atrocious.

    Why would you want the withdrawal and balance check to run concurrently?

    1. Re:Awful example in the article by Anonymous Coward · · Score: 2, Funny

      The example in the article is atrocious.

      Why would you want the withdrawal and balance check to run concurrently?

      Because it would make it much easier to profit from self-introduced race conditions and other obscure bugs when I get to the ATM tomorrow :)

    2. Re:Awful example in the article by awol · · Score: 3, Interesting

      The example in the article is atrocious.

      Why would you want the withdrawal and balance check to run concurrently?

      Because I can do a whole lot of "local" withdrawal processing whilst my balance check is off checking the canonical source of balance information. If it comes back OK then the work I have been doing in parallel is now committable work and my transaction is done, perhaps in no more time than the longer of the balance check and the withdrawal. Whilst the balance check/withdrawal example may seem ridiculous, there are some very interesting applications of this kind of problem in securities (financial) trading systems, where the canonical balances of different instruments are conveniently (and sometimes mandatorily) stored in different locations and some complex synthetic transactions require access to balances from more than one instrument in order to execute properly.
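
      Roughly, the pattern looks like this. It is a toy sketch: `check_balance` stands in for the slow trip to the canonical store (the sleep is an assumption to simulate latency), and the account is just a dict, not anything resembling a real trading system.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def check_balance(account, amount):
    # Stand-in for a slow round trip to the canonical balance store.
    time.sleep(0.05)
    return account["balance"] >= amount


def prepare_withdrawal(account, amount):
    # Local work that touches nothing shared: build the pending record.
    return {"account": account["id"], "debit": amount}


def withdraw(account, amount):
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Start the authoritative check in the background...
        balance_ok = pool.submit(check_balance, account, amount)
        # ...and overlap the local, speculative work with it.
        pending = prepare_withdrawal(account, amount)
        if balance_ok.result():  # wait for the authoritative answer
            account["balance"] -= amount  # commit the speculative work
            return pending
    return None  # check failed: the speculative work is simply discarded
```

      Nothing is committed until the check returns, so the two steps overlap in time without the withdrawal ever racing ahead of the balance.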

      It seems to me that most of the interesting parallelism problems relate to distributed systems and it is not just a question of N phase commit databases but rather a construct of "end to end" dependencies in your processing chain where the true source of data cannot be accessed from all the nodes in the cluster at the same time from a procedural perspective.

      It is this fact that to me suggests that the answer to these issues is a radical change in language toward the functional or logical types of languages like Haskell and Prolog, with Erlang being a very interesting place on that path for right now.

      --
      "The first thing to do when you find yourself in a hole is stop digging."
    3. Re:Awful example in the article by redfood · · Score: 1

      This is just a simple example to show how hard it is to keep data consistent across different processes. A better example would have been to think about a shared bank account where one person might be withdrawing money and the other might be checking the balance at the same time.
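
      A minimal sketch of the shared-account version, assuming two people each try to take 60 out of an account holding 100. The `SharedAccount` class is made up for illustration. Without the lock, both withdrawals could pass the balance check before either one subtracts.

```python
import threading


class SharedAccount:
    def __init__(self, balance):
        self._balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount):
        # Check-then-update must be one atomic step, or two withdrawals
        # can both pass the check against the same stale balance.
        with self._lock:
            if self._balance < amount:
                return False
            self._balance -= amount
            return True

    def balance(self):
        with self._lock:
            return self._balance


account = SharedAccount(100)
results = []
threads = [
    threading.Thread(target=lambda: results.append(account.withdraw(60)))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

      Exactly one of the two withdrawals succeeds and the balance ends at 40, whichever thread gets there first.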

    4. Re:Awful example in the article by grantham · · Score: 1

      Why would you want the withdrawal and balance check to run concurrently?

      Because it might be too much trouble to get both processes to run on the same node. With proper barriers, you can make sure they get run in the correct order.

    5. Re:Awful example in the article by theNote · · Score: 1

      Not to beat a dead horse, but transactions aren't the same thing as parallel processing.

    6. Re:Awful example in the article by theNote · · Score: 1

      It's an ATM.
      You are guaranteed to have no more than one user at a time.
      Ordering in this case shouldn't be hard.

    7. Re:Awful example in the article by redfood · · Score: 1

      No but transactions are easier for the lay public to understand and they introduce many of the same issues. Plus, simultaneous transactions do create processes that are running simultaneously (i.e. in parallel) do they not? I agree there are many issues that come up with parallel programming that transactions won't introduce (how to best parallelize an algorithm, how to prevent deadlock etc) but transactions do describe the problem of maintaining consistent data across processes well.

    8. Re:Awful example in the article by jbengt · · Score: 1

      No,
      it is a bank account.
      It is possible to access it from 2 ATMs at one time.
      Ordering in this case shouldn't be too hard, but it does make an example understandable to the layman.

  10. Re:Parallel programming is dead. No one uses it... by Nursie · · Score: 3, Informative

    Bullshit.

    Tell that to apache, and oracle, and basically anything that runs in a server room.

  11. Re:Parallel programming is dead. No one uses it... by beelsebob · · Score: 3, Insightful

    Threading i don't count as parallel processing for the desktop. I don't even hear of any games or applications built for parallel.

    Uhhhhhhhhhhh? Yes, well done with that...

  12. The type of multithreaded design used is what by Anonymous Coward · · Score: 0

    Look up the term "race condition" here -> http://en.wikipedia.org/wiki/Race_condition , & you'll get an idea of what the "problems inherent" are - I feel the same as you do though, as long as I keep 1 thread doing 1 task, & another thread of execution doing another (albeit, on 2 diff./discrete sets of data, not working on the SAME set of data) - this is known as "coarse multithreading" (keeping multiple threads of execution off the same dataset) vs. "fine multithreading" (see here for more on that -> http://www.cs.bu.edu/~best/courses/cs551/lectures/lecture-02.html ) where the multiple threads of execution work on the same data involved.

    (Bit of an "oversimplification" on my part possibly, but the broad strokes are there - the 'finer points' with examples are on those pages from the URL's above I posted)

    APK

  13. combine with netbook tech by asadodetira · · Score: 1

    I see some potential in combining innovations meant for the netbooks with multiple processors. Low power & lightweight software may mix well with multiple CPUs.

    1. Re:combine with netbook tech by Narishma · · Score: 1

      Not when Microsoft and Intel are limiting the number of cores on netbooks to 1 for fear of them competing with their more lucrative OSs or CPUs.

      --
      Mada mada dane.
    2. Re:combine with netbook tech by DragonWriter · · Score: 1

      Not when Microsoft and Intel are limiting the number of cores on netbooks to 1 for fear of them competing with their more lucrative OSs or CPUs.

      If someone wanted to put a dual-core Atom N330 into a netbook, how would Intel (much less Microsoft) stop them?

  14. Clojure by slasho81 · · Score: 4, Interesting

    Check out Clojure. The only programming language around that really addresses the issue of programming in a multi-core environment. It's also quite a sweet language besides that.

    1. Re:Clojure by Anonymous Coward · · Score: 0

      Clojure is newer than multicore. How can you say it's the only language around that addresses it? What about Erlang? Scala? Clojure is neat, but it's not the first language of its kind by several decades.

    2. Re:Clojure by slasho81 · · Score: 2, Informative

      Erlang is meant for distributed computation, which is a grand overkill for most programs. See here: http://groups.google.com/group/clojure/msg/2ad59d1c4bb165ff Scala unlike Clojure did not embrace the importance of immutability to concurrency programming, which is why I think it's badly lacking. See here: http://clojure.org/state

    3. Re:Clojure by Cyberax · · Score: 1

      Check Erlang ;)

    4. Re:Clojure by slasho81 · · Score: 1

      Check my reply to the other reply.

    5. Re:Clojure by Cyberax · · Score: 2, Insightful

      Erlang is quite OK for non-distributed programming. Its model of threads exchanging messages is just a natural fit for it. As it is for multicore systems.

    6. Re:Clojure by slasho81 · · Score: 2, Informative
      Here's what Rich Hickey wrote about the matter in http://clojure.org/state

      I chose not to use the Erlang-style actor model for same-process state management in Clojure for several reasons:

      • It is a much more complex programming model, requiring 2-message conversations for the simplest data reads, and forcing the use of blocking message receives, which introduce the potential for deadlock. Programming for the failure modes of distribution means utilizing timeouts etc. It causes a bifurcation of the program protocols, some of which are represented by functions and others by the values of messages.
      • It doesn't let you fully leverage the efficiencies of being in the same process. It is quite possible to efficiently directly share a large immutable data structure between threads, but the actor model forces intervening conversations and, potentially, copying. Reads and writes get serialized and block each other, etc.
      • It reduces your flexibility in modeling - this is a world in which everyone sits in a windowless room and communicates only by mail. Programs are decomposed as piles of blocking switch statements. You can only handle messages you anticipated receiving. Coordinating activities involving multiple actors is very difficult. You can't observe anything without its cooperation/coordination - making ad-hoc reporting or analysis impossible, instead forcing every actor to participate in each protocol.
      • It is often the case that taking something that works well locally and transparently distributing it doesn't work out - the conversation granularity is too chatty or the message payloads are too large or the failure modes change the optimal work partitioning, i.e. transparent distribution isn't transparent and the code has to change anyway.
    7. Re:Clojure by shutdown+-p+now · · Score: 2, Insightful

      Check out Clojure [clojure.org]. The only programming language around that really addresses the issue of programming in a multi-core environment.

      That's a rather bold statement. You do realize that those neat features of Clojure like STM or actors weren't originally invented for it? In fact, you could do most (all?) of that in Haskell before Clojure even appeared.

      On a side note, while STM sounds great in theory for care-free concurrent programming, the performance penalty that comes with it in existing implementations is hefty. It's definitely a prospective area, but it needs more research before the results are consistently usable in production.

    8. Re:Clojure by slasho81 · · Score: 2, Interesting

      That's a rather bold statement. You do realize that those neat features of Clojure like STM or actors weren't originally invented for it? In fact, you could do most (all?) of that in Haskell before Clojure even appeared.

      I do realize that many of the innovations in Clojure are not brand new, but Clojure did put them into a practical form that incorporates many "right" innovations into one language. Haskell is a fine language and one of the languages that heavily influenced Clojure. Clojure makes some paradigms used in Haskell far more usable than they are in their original form.

      On a side note, while STM sounds great in theory for care-free concurrent programming, the performance penalty that comes with it in existing implementations is hefty. It's definitely a prospective area, but it needs more research before the results are consistently usable in production.

      In addition, STM is more of a general title for a set of technologies with the same general principles but vastly different implementations. Clojure's implementation plus the immutability paradigm Clojure embraces make its STM darn close to care-free concurrent programming in almost all situations you'll encounter. And I'm well aware that this is an even bolder statement, but I strongly recommend checking it out if you do any kind of concurrent programming. It delivers.

    9. Re:Clojure by johanatan · · Score: 0

      STM has been implemented in Haskell (and costs a 20-30% penalty) but as far as I know, the actor model is from Erlang, not Haskell.

  15. Re:Parallel programming is dead. No one uses it... by MaXintosh · · Score: 1

    Not even all scientists. I use a lot of programs that are computationally intensive - stuff that you start running and walk away for a week - and I'd guess about half of the new programs are not written to take advantage of parallel processing. For us, this is seriously frustrating because... well, I don't like having to wait a week for the software to return a collection of answers, even using powerful machines.

    A big problem for scientific computing - and maybe it's a problem elsewhere, too - is that too many programs are collections of legacy code squashed just-so to make it compilable on a new machine. While I typically applaud the path of least resistance when it comes to work, it makes the software inefficient as heck.

  16. A view based on history... by Anonymous Coward · · Score: 3, Insightful

    Rehash time...

    Parallelism typically falls into two buckets: data parallel and functional parallel. The first challenge for the general programming public is identifying what is what. The second challenge is synchronizing parallelism in as bug-free a way as possible while retaining the performance advantage of the parallelism.
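
    To make the two buckets concrete, here is a small sketch. The function names (`square`, `load_config`, `warm_cache`) are placeholders invented for the example, and a thread pool is just one convenient way to show the shape of each kind.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x * x


def load_config():
    return {"mode": "demo"}


def warm_cache():
    return ["a", "b"]


with ThreadPoolExecutor() as pool:
    # Data parallel: the same operation applied to every element.
    squares = list(pool.map(square, range(8)))

    # Functional (task) parallel: different, independent jobs side by side.
    config_future = pool.submit(load_config)
    cache_future = pool.submit(warm_cache)
    config, cache = config_future.result(), cache_future.result()
```

    Neither half needs any synchronisation here because nothing is shared; the second challenge only shows up once the jobs start touching the same data.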

    Doing fine-grained parallelism, which is what the functional crowd is promising, is something that will take a *long* time to become mainstream (other interesting examples are things like LLVM and K, but they tend to focus more on data parallel). Functional is too abstract for most people to deal with (yes, I understand it is easy for *you*).

    Short term (i.e. ~5 years), the real benefit will be in threaded/parallel frameworks (my app logic can be serial, tasks that my app needs happen in the background).

    Changing industry tool-chains to something entirely new takes many, many years. What most likely will happen is that transactional memory will make it into some level of hardware, enabling faster parallel constructs, and a cool new language will pop up formalizing all of these features. Someone will tear that cool new language apart by removing the rigor and giving it C/C++ style syntax, and then the industry will start using it.

    1. Re:A view based on history... by NoOneInParticular · · Score: 1

      Functional is too abstract for most people to deal with (yes, I understand it is easy for *you*).

      Well, I do seem to recall that the only language non-programmers actually program in, Excel, is purely functional. They don't seem to have a problem understanding it, but Visual Basic is way too complicated.

  17. Established vs new programming languages for HPC by Raul654 · · Score: 3, Informative

    This is a subject near and dear to my heart. I got to participate in one of the early X10 alpha tests (my research group was asked to try it out and give feedback to Vivek Sarkar's IBM team). Since then, I've worked with lots of other specialized HPC programming languages.

    One extremely important aspect of supercomputing, a point that many people fail to grasp, is that application code tends to live a long, long, long time. Far longer than the machines themselves. Rewriting code is simply too expensive and economically inefficient. At Los Alamos National Lab, much of the source code they run is nuclear simulations written in Fortran 77 or Fortran 90. Someone might have updated it to use MPI, but otherwise it's the same program. So it's important to bear in mind that those older languages, while not nearly as well suited for parallelism (either for programmer ease-of-use/efficiency, or to allow the compiler to do deep analysis/optimization/scheduling), are going to be around for a long time yet.

    --
    To make laws that man cannot, and will not obey, serves to bring all law into contempt.
    --E.C. Stanton
  18. Re:Chapel? by Anonymous Coward · · Score: 0, Troll

    Huh? What spoken language is Perl supposed to be?

    Read the Wikipedia. Not PERL itself, but, from the article:

    Wall and his wife were studying linguistics with the intention afterwards of finding an unwritten language, perhaps in Africa, and creating a writing system for it. They would then use this new writing system to translate various texts into the language, among them the Bible.

    Great, just what we need. More clueless crusaders giving otherwise peaceful peoples the tools to become planet cancer. But there's more:

    Wall's Christian faith has influenced...Perl, such as the name itself, a biblical reference to the "Pearl of great price" (Matthew 13:46). Similar references are the function name bless, and...Perl 6 design documents with categories such as apocalypse and exegesis. Wall has also alluded to his faith when he has spoken at conferences, including a rather straightforward statement of his beliefs...

    I would never work for anybody who used PERL.

  19. Re:Parallel programming is dead. No one uses it... by quanticle · · Score: 2, Insightful

    If threading isn't parallelism, then what is? At what level of separation between separate streams of execution does an application become "parallel"?

    --
    We all know what to do, but we don't know how to get re-elected once we have done it
  20. It has to be easy by DarrylKegger · · Score: 1

    I think it's just too darned difficult for most of us to write mainstream apps whilst figuring out how to take advantage of multiple cores, or figuring out how to write a web browser in Erlang etc. I'll be curious to see what comes out of the Stanford Pervasive Parallelism Lab; I think they have the right approach to tackling the 'coding for many cores' problem. http://www.theregister.co.uk/2008/04/30/stanford_funding_ppl/

    1. Re:It has to be easy by rbmyers · · Score: 1

      If C++ caught on, anything can catch on. Well, maybe not Ada.

  21. Ok, I realize this is reaching... by feepness · · Score: 1

    But my first thought upon reading "Chapel" was...

    I'm multi-threaded bitch!

  22. Bad idea by junglebeast · · Score: 0, Redundant

    Creating threads is extremely easy to do in Java, .NET and C++. We don't need "inherent" parallelism in every loop where the performance gains are negligible. The programmer knows best. The programmer can synchronize more efficiently and divide tasks more efficiently than any automated "parallelizer" ever will, and having such automated retarded languages will only reduce overall performance when you have lots of applications running, all of which might as well have been sequential, but are now all fighting for multiple cores.

    1. Re:Bad idea by JohnFluxx · · Score: 1

      Lol, what? Maybe you are thinking about just 2 or 3 cores, but what about when you have 8/9 cores like the PS3? Or a hundred cores like some of the experimental chips just coming out? I think you're seriously overestimating the ability of the average programmer if you think the programmer knows best how to maximise performance when you have dozens of cores.

  23. Meh. DARPA barking up the wrong tree. by ancient_kings · · Score: 1, Troll

    Of the billions of lines of code that runs on most of the world's fastest supercomputers, 99% of it is in FORTRAN. This will NEVER CHANGE. PERIOD. Anybody who tries to change this, should be shown the door. Granted, most of it is still Fortran77, but it works, runs the fastest and is the easiest to maintain. This is why the next generation of Fortran (Fortran 2008) will be hard-core, parallel driven. You "C" beanies will need C/OPENMPI/OPENMP/TAU to just even try to match Fortran2008's power. Intel knows this, this is why all their parallel libraries, codes and tools are geared for Fortran and NOT C. Live with it...

    1. Re:Meh. DARPA barking up the wrong tree. by TeknoHog · · Score: 1

      AFAIK, Fortran has been a parallel language since '90. Meaning data parallel, in that you can write vector/matrix operations that are explicitly parallel, so the compiler can split these up. So what's new in Fortran 2008 in the parallel sense?

      --
      Escher was the first MC and Giger invented the HR department.
    2. Re:Meh. DARPA barking up the wrong tree. by Beale · · Score: 1

      Anybody who tries to change this, should be shown the door.

      Will, but possibly not should.

      You "C" beanies will need C/OPENMPI/OPENMP/TAU to just even try to match Fortran2008's power. Intel knows this, this is why all their parallel libraries, codes and tools are geared for Fortran and NOT C. Live with it...

      Last time I looked, MKL and the Intel compiler were much more focused towards C. Maybe you're thinking of IBM?

    3. Re:Meh. DARPA barking up the wrong tree. by NewbieProgrammerMan · · Score: 1

      Of the billions of lines of code that run on most of the world's fastest supercomputers, 99% of it is in FORTRAN. This will NEVER CHANGE. PERIOD. Anybody who tries to change this, should be shown the door. Granted, most of it is still Fortran77, but it works, runs the fastest and is the easiest to maintain.

      That's all well and good for existing applications, but if somebody comes along with a new language and tools that let me write a new number-crunching app with a lot less effort than the current set of languages, then all the Fortran/C/etc. fans can go fly a kite.

      Yeah, maybe Chapel/X10/Fortress won't be massively successful, maybe they won't knock Fortran off the roost, but if they're introducing new ideas and trying to make parallel coding a little easier and a little more foolproof for us not-so-uber coders, then I'm all for it. Something tells me the Fortran standards group will gladly assimilate any cool ideas that come out of DARPA's research effort.

      --
      [b.belong('us') for b in bases if b.owner() == 'you']
  24. Re:Chapel? by sam_handelman · · Score: 3, Funny

    You know what, I don't think I'm going to use modern English, either.

      Don't you know that early modern English was invented to have something standard into which the bible could be translated? For shame!

      As a devoted secularist, I'll just burn all my shakespeare and rushdie after I delete all my perl code.

    --
    The good and new comes from no quarter where it is looked for, and is always something different from what is expected.
  25. Re:Parallel programming is dead. No one uses it... by Daniel+Dvorkin · · Score: 1

    A big problem for scientific computing - and maybe it's a problem elsewhere, too - is that too many programs are collections of legacy code squashed just-so to make it compilable on a new machine.

    It's not solely a problem with scientific computing by any means, but as a bioinformatics grad student who used to be a business programmer, I think I can say it's worse in scientific computing than in most other places. As a rule, when you design a business app, you design it for longevity. The company wants something that will not just work for the moment, but work for years, and preferably be maintainable after the person who wrote it leaves the company. Most scientific programs, in contrast, are written to solve a specific problem for a specific project in a specific lab; and the people writing them are most often grad students or postdocs who are pretty much guaranteed to leave at some point, but this isn't taken into account when the programs are being written. (At three in the morning, to meet the deadline for a paper submission or conference presentation the next day ...) If they live on after that, it's pretty much by accident.

    There's a reason for this. Most scientists are not computer scientists, and they don't want to be. They've had neither a rigorous education in best practices, nor the production-coding experience to see those practices put into effect. They were concentrating on learning other things -- the core knowledge of their field -- and that knowledge is vital to their research. No one, unfortunately, has the time to learn it all.

    I'm not sure what the answer is. I'm a better programmer than most of the people I work with (he says, trying not to break his arm patting himself on the back) but I'm also forty years old and still a year or two away from finishing my PhD. I'm glad to have the experience that's useful in my work, but I kind of doubt that my fellow students (or faculty!) would trade away their youth for that experience. Ultimately I think the answer comes down to having a mix of people in the lab: at least a few who have business programming and/or formal CS education in their backgrounds, working alongside those who have focused their entire careers on the science itself. That way everyone can learn from each other.

    --
    The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
  26. Beg to differ on a small point by Anonymous Coward · · Score: 0

    "MT apps on a single CPU core can have benefits-- such as, your UI can remain responsive to the user during serious number crunching-- but at 100% CPU load, this necessarily comes with the cost of your number-crunching taking longer - by synaptik (125) on Sunday June 07, @03:23PM (#28243359) Homepage

    Actually, I have heard tell that multithreaded applications (not specifically SMP designed ones that use Win32 API calls like SetThreadAffinity etc. et al) actually have MORE overheads than single-threaded code, & are therefore actually slower on single-core/single CPU machines, than multithread designed code are on SMP/MultiCore systems.

    AND??

    As far as the UI being responsive? Yes, I suppose you could use a thread to update/refresh (using form objects methods such as Application.MainForm.Update OR Application.MainForm.Refresh as examples from Borland Delphi), but, then again/alternately?

    You can use Application idletime to do that also (Borland Delphi, as just 1 example language I am using here, once more), as Delphi provides an "Application.OnIdle" event you can leverage for that type of work, & it does the job well (I have used it for this before is why I mention it now)...

    Using application or systemwide idletime, you can safely perform any screen/ui updates therein, foregoing the use of a thread there (the ONLY reason I mention this, & SPECIFICALLY with Delphi, though it may also be true of other programming languages, is that the Borland GUI RAD controls you're given are NOT "guaranteed" to be 100% thread-safe is why I note this point).

    APK

    1. Re:Beg to differ on a small point by Anonymous Coward · · Score: 0

      You can use Application idletime to do that also

      So while the application isn't idle (eg crunching large datasets) the UI is unresponsive. Gotcha, that really solves the issue.
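
      The alternative to idle-time updates can be sketched in a few lines of Python. This is only an illustration, not Delphi code: `crunch`, `results` and the `ticks` counter are made-up names, and the polling loop is a stand-in for a real toolkit's event loop. The worker thread does the number crunching and hands the answer back through a queue, so only the main loop ever touches "UI" state, which sidesteps the thread-safety worry about GUI controls raised above.

```python
import queue
import threading

def crunch(numbers, results):
    # Runs on the worker thread: the long computation.
    results.put(sum(n * n for n in numbers))

results = queue.Queue()
worker = threading.Thread(target=crunch, args=(range(1_000_000), results))
worker.start()

ticks = 0        # stand-in for "UI events handled while waiting"
answer = None
while answer is None:
    ticks += 1   # a real event loop would repaint and handle input here
    try:
        # Wake up regularly instead of blocking until the work is done.
        answer = results.get(timeout=0.01)
    except queue.Empty:
        pass

worker.join()
print("answer:", answer, "after", ticks, "loop iterations")
```

      The main loop keeps turning over while the computation runs, which is the part the idle-time approach cannot give you when the application is never idle.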

  27. Re:Parallel programming is dead. No one uses it... by Cyberax · · Score: 1

    Can Apache use 2 CPUs to serve one request? Well, I thought so.

    Oracle, AFAIK, can't use several CPUs to service one query too (I might be wrong).

  28. Re:Parallel programming is dead. No one uses it... by sam_handelman · · Score: 2, Interesting

    Parent is kinda flamebait, and it's exactly the opposite of my experience.

      Scientists (I am one) who also write some of their own code, have much better things to do with our time than to try and make the software efficient. When we figure out what we want done, we hand it over to professional programmers who, if the cost:benefit analysis works out, will parallelize or optimize it as they're told is needed. Even lousy programmers are expensive, and hardware is cheap.

      I 100% agree with the end of his statement - was it 10, 15 years ago scientific computing was still done in fortran FOR A REASON - the optimizing compiler didn't completely suck? Some scientific computing is still done in FORTRAN but that's been purely a legacy thing since the optimizing compilers for C caught up. I'm sure someone clever will find some way to get an interpreted language to figure out what depends on what and parallelize your code for you. This is a very hard problem to do perfectly, but sensible people will quickly realize that's okay. For some cases, I can beat an optimizing compiler by writing assembly - am I ever going to do that? Hell no.

      Now, this may result in additional good coding practices which will be required of us so that the optimizing compiler can make easier sense of our code. Might it be lower overhead to create an optimization friendly programming language, which I suspect will end up amounting to making such practices an explicit requirement? Probably not, but it depends on how closely these new programming languages adhere to existing languages (I haven't looked at either example discussed in the article.)

    --
    The good and new comes from no quarter where it is looked for, and is always something different from what is expected.
  29. Re:Parallel programming is dead. No one uses it... by hedwards · · Score: 1

    That's funny, because Fallout 3 for instance is enhanced for multicore. I'd assume there were other games.

    I'd be skeptical of allowing smart compilers to do the programming for us, a compiler no matter how smart isn't the programmer.

  30. Re:Parallel programming is dead. No one uses it... by MillionthMonkey · · Score: 1

    except scientists who use supercomputers

    I don't really consider myself a scientist but I've had to write parallel systems twice now. It goes much better when you either don't need internode communication, or you're allowed to implement it that way (when nobody's around to complain they might have to configure something).

    Threading I agree is not the same thing... with a decent threading API available you can fan out a loop across all your CPU cores with a small paragraph of code, if the iterations are independent of one another. Setting up that independence (if you have to) is most of the work. But that's a trick you can also use in a UI to speed up animations etc.

    Of course there are standard libraries now to handle this stuff, so some of the fun is gone.
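
    For what it's worth, the "small paragraph of code" might look roughly like this in Python, using the standard `concurrent.futures` module. `simulate_step` and `fan_out` are hypothetical names, and the loop body is a placeholder for whatever independent work each iteration does.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def simulate_step(i):
    # Hypothetical loop body: depends only on its own index,
    # so iterations can run in any order.
    return sum(k * k for k in range(i))

def fan_out(func, items, workers=None):
    # One worker per core by default; map() returns results in input order.
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))

results = fan_out(simulate_step, range(8))
print(results)
```

    One caveat: in CPython, threads will not speed up CPU-bound pure-Python work because of the global interpreter lock. `ProcessPoolExecutor` offers the same `map` interface and is the usual swap in that case.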

  31. The mess by Animats · · Score: 4, Interesting

    I've been very disappointed in parallel programming support. The C/C++ community has a major blind spot in this area - they think parallelism is an operating system feature, not a language issue. As a result, C and C++ provide no assistance in keeping track of what locks what. Hence race conditions. In Java, the problem was at least thought about, but "synchronized" didn't work out as well as expected. Microsoft Research people have done some good work in this area, and some of it made it into C#, but they have too much legacy to deal with.

    At the OS level, in most operating systems, the message passing primitives suck. The usual approach in the UNIX/Linux world is to put marshalling on top of byte streams on top of sockets. Stuff like XML and CORBA, with huge overhead. The situation sucks so bad that people think JSON is a step forward.

    What you usually want is a subroutine call; what the OS usually gives you is an I/O operation. There are better and faster message passing primitives (see MsgSend/MsgReceive in QNX), but they've never achieved any traction in the UNIX/Linux world. Nobody uses System V IPC, a mediocre idea from the 1980s. For that matter, there are still applications being written using lock files.

    Erlang is one of the few parallel languages actually used to implement large industrial applications.

    1. Re:The mess by OneSmartFellow · · Score: 1

      Please keep up. Granted the next C++ standard is seemingly mired in bureaucracy, but at least it has addressed threading for a decade. It would be great if the powers that be would finally call it quits and finalize what we have, then we could move on to the next issue, automatic failover.

    2. Re:The mess by lennier · · Score: 1

      Threading isn't any kind of solution. Threading is the problem. Languages like C++ which think threading can be made safe at any speed, are also the problem.

      --
      You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
    3. Re:The mess by Anonymous Coward · · Score: 0

      > As a result, C and C++ provide no assistance in keeping track of what locks what.

      Actually, that is a design feature of C/C++. C (and in most cases C++) was designed to be only a step or two above assembler level programming, and as a result you never pay for what you don't use. If you want locks in your code, then YOU manage them. Libraries (such as pthreads) can take care of the multiprocessing part of your application if you want, but you never have to pay for the overhead of locking mechanisms if you don't need them.

      As a result, you have FAR more leeway when designing/writing your code for it to work the way you want to.
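
      The same "you manage the lock" discipline shows up in most threading libraries, not just pthreads. A minimal Python sketch (the `Counter` class and `hammer` function are made up for illustration): nothing in the language ties `_lock` to `value`, so the association exists only because the programmer keeps to it.

```python
import threading

class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()   # guards self.value, by convention only

    def increment(self):
        # The read-modify-write below is not atomic, so take the lock first.
        with self._lock:
            self.value += 1

def hammer(counter, times):
    for _ in range(times):
        counter.increment()

counter = Counter()
threads = [threading.Thread(target=hammer, args=(counter, 10_000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)
```

      Forget the `with self._lock:` line in one code path and nothing complains, which is the grandparent's point about the language not keeping track of what locks what.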

    4. Re:The mess by Anonymous Coward · · Score: 0

      Sigh .. my last perl program used JSON messages over System V IPC (msgsnd,msgrcv). And here I was feeling proud of it .. sigh

    5. Re:The mess by Animats · · Score: 2, Interesting

      Sigh .. my last perl program used JSON messages over System V IPC (msgsnd,msgrcv). And here I was feeling proud of it

      I know, I know. I have an application in production which uses Python "pickle" over pipes to subprocesses.

      Incidentally, it's interesting to speculate what the UNIX/Linux world might have been like if, when a process exited, it was able to return a result list, like the parameter list that goes in. Shell scripts, and "make", might not have been so blind to what the subprogram actually did.
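
      For anyone who hasn't seen the pickle-over-pipes pattern, here is a rough sketch. It is not the production code mentioned above: `CHILD`, `run_task` and the task layout are invented for the example. The parent pickles a task onto the child's stdin and unpickles a structured result from its stdout.

```python
import pickle
import subprocess
import sys

# Source of the child process: read one pickled task, write one pickled result.
CHILD = r"""
import pickle, sys
task = pickle.load(sys.stdin.buffer)
result = {"sum": sum(task["numbers"]), "count": len(task["numbers"])}
pickle.dump(result, sys.stdout.buffer)
sys.stdout.buffer.flush()
"""

def run_task(task):
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # communicate() sends the input, closes stdin and collects stdout.
    out, _ = proc.communicate(pickle.dumps(task))
    if proc.returncode != 0:
        raise RuntimeError("child exited with status %d" % proc.returncode)
    return pickle.loads(out)

print(run_task({"numbers": [1, 2, 3]}))
```

      The usual warning applies: only unpickle data from processes you trust, since unpickling can execute arbitrary code.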

    6. Re:The mess by Anonymous Coward · · Score: 0

      Just a quick comment. System V IPC mechanisms include shared memory, semaphores, and message queues. These are used all the time in Unix programming.

  32. Re:Parallel programming is dead. No one uses it... by moon3 · · Score: 1

    Network stack doesn't run in parallel therefore there is no need to use multiple CPUs to service network requests. Even worse, multi-tasked servers perform much worse than single threaded servers. The bottleneck is the network end not the CPU anyway.

  33. Art of Parallel Programmin by rayharris · · Score: 2, Interesting

    There needs to be an equivalent of Donald Knuth's "Art of Computer Programming" as a definitive reference for parallel algorithms. Until then, I don't care how many cores you have, you won't get the most out of them.

    --
    I void warranties.
    1. Re:Art of Parallel Programmin by Xiver · · Score: 1

      I would love to see Knuth's works updated for parallel algorithms. I agree with you too.

      --
      10: PRINT "Everything old is new again."
      20: GOTO 10
  34. How to Solve the Parallel Programming Crisis by Louis+Savain · · Score: 2, Interesting

    Exactly, some problems are inherently serial. These programs would run slower if you made them run in parallel.

    If they are inherently sequential, then obviously they cannot be made to run in parallel. The truth is that the vast majority of computing applications, both existing and future, are inherently parallel. As soon as some maverick startup (forget the big players like Intel, Microsoft, or AMD because they are too married to the old ways) figures out the solution to the parallel programming crisis (see link below), get ready for a flood of super complex parallel applications to hit the market, especially in the AI, gaming and simulation fields. Cars will drive themselves and robots will maintain your home, that kind of stuff. The possibilities are mind boggling.

    Now the reason that the old timers cannot solve the problem is that they are all addicted to the Turing Machine model of computing and last century's multithreaded approach to concurrency. The Turing Machine model is evidently no help in solving the crisis and threads are inherently non-deterministic. There is an urgent need to move away from antiquated and flawed paradigms that do not contribute to the solution. Indeed, they got us into this mess to begin with.

    How to Solve the Parallel Programming Crisis

    1. Re:How to Solve the Parallel Programming Crisis by MikeBabcock · · Score: 1

      One of the reasons I like the Cell processor design so much is because you can run software in true parallel style on the SPUs and simply put the results together periodically with the main CPU core. It's not perfect for sure, and the limited local memory on each SPU can be limiting but it certainly is a lot different from how we normally handle multi-core computing.

      Looking at problems like the calculations for Folding@Home is an interesting study in massive parallelism.

      --
      - Michael T. Babcock (Yes, I blog)
    2. Re:How to Solve the Parallel Programming Crisis by hitmark · · Score: 1

      I'm tempted to call the CELL a collection of FPGA, as each SPU can in theory be given a single step in a chain and just keep doing it over and over...

      And was there not talk about having some kind of way to wire up multiple CELL powered devices so that they could act as a single large computer?

      --
      comment first, facts later. http://chem.tufts.edu/AnswersInScience/RelativityofWrong.htm
    3. Re:How to Solve the Parallel Programming Crisis by SnowZero · · Score: 1

      Your proposal sounds no different from the Dataflow architecture that people have worked on for decades. While they are appealing for their elegance, the Wikipedia article sums up the problems succinctly:

      The research, however, never overcame the problems related to:

      • efficiently broadcasting data tokens in a massively parallel system
      • efficiently dispatching instruction tokens in a massively parallel system
      • building content addressable memories large enough to hold all of the dependencies of a real program

      Although you reference "data flow" in some of the comments on your blog, you don't seem to be referring to "dataflow architectures" since they very much have "signal flow" as you call it, in addition to data. GPUs are not really dataflow at all, and are more correctly described as variants of streaming or vector processors.

      By all means, continue with your work if you think it is promising, but you should reference prior work, and show how you can overcome the limitations that the prior work got hung up on. In dataflow, it wasn't the theoretical design that caused problems, but instead practical implementation of the chips. There are enough free resources out there for doing simulated chip design that you could design a simple prototype. Until then, don't be surprised if people are skeptical.

    4. Re:How to Solve the Parallel Programming Crisis by Louis+Savain · · Score: 4, Insightful

      Sorry but, IMO, the Cell is a perfect example of how not to design a multicore processor. Heterogeneous processors introduce nothing new to the table of solutions that was not already there. We had systems with CPUs and GPUs before the Cell (or Intel's Larrabee and AMD's Fusion) showed up. Everybody knows that they're a pain in the ass to program. Neither CUDA nor OpenCL nor Intel's much ballyhooed TBB (Threading Building Blocks) will change that fact.

      My point is that one does not design a parallel processor and then come up with a programming model to exploit it. It should be the other way around. The programming model should come first. One should design a model that makes parallel programming easy and the resulting apps rock-solid. Only then, after you have perfected your model, should you even consider designing a processor to support the model.

      IOW, everybody's doing it wrong, and by everybody, I mean all the big players in the multicore hardware/software industry: Intel, Microsoft, IBM, Sun-Oracle, AMD, ARM, Apple, FreeScale, etc. The major computer science centers who are getting a lot of research money from the industry are not helping either since they have to kowtow to the likes of Intel and AMD whose main interest is to safeguard their installed base and preserve continuity.

      It makes no difference. When the pain becomes unbearable (it's all about money), it will suddenly dawn on everybody that what is needed is to break away from the past.

    5. Re:How to Solve the Parallel Programming Crisis by Louis+Savain · · Score: 2, Interesting

      I am not sure why you make your assumptions since the solution that I am proposing emphasizes instructions and timing more than anything else. Data is just the environment where the program effects changes and reacts to changes. If my solution is dataflow then so is a pulsed neural network that consists of connected sensors and effectors. I have never heard neural network programmers refer to their programs as dataflow programs. Yet, this is what I am proposing: a program should be more like a signal-driven neural network.

      AFAIK, dataflow systems do not concern themselves with a program counter. One of the most important aspects of the solution that I am advancing is that determinism is essential to reliable software. There should never be any ambiguity as to whether any two events/operations in a program are sequential or parallel. This requires a program counter to mark time. Another important aspect is that a program should be 100% reactive, i.e., everything should happen in reaction to a change/event.

    6. Re:How to Solve the Parallel Programming Crisis by Sehnsucht · · Score: 1
    7. Re:How to Solve the Parallel Programming Crisis by CastrTroy · · Score: 1

      Maybe because parallel programming is just inherently hard. We've had simulated parallel programming for probably at least 15 years, using threads, or even multiple processes, and nobody has found a good model that actually makes it easy to do parallel programming. I took a course in university, and although I did quite well, I have to say that parallel programming requires you to relearn a lot of what you have learned, and requires a very large paradigm shift. We have enough programmers who have trouble with sequential programming, we don't need to make it any harder for them.

      --

      Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
    8. Re:How to Solve the Parallel Programming Crisis by Anonymous Coward · · Score: 0

      > Now the reason that the old timers cannot solve the problem is that they are all addicted to the Turing Machine model of computing and last century's multithreaded approach to concurrency. The Turing Machine model is evidently no help in solving the crisis and threads are inherently non-deterministic.

      You have learned to spell the fancy words without completely grasping what the fancy words mean. It is the Turing machine that is the physical reality of modern hardware, and the threads are the model the Turing machines simulate. You've essentially got your criticism backwards; if our current methods of parallel processing aren't working for us, then it is the thread model that is failing us. And behold: that is the whole point of designing new languages to parallelize our computations for us - so that they will do so in a sane way instead of human programmers handling threads and locking manually.

    9. Re:How to Solve the Parallel Programming Crisis by grumbel · · Score: 2, Insightful

      using threads, or even multiple processes, and nobody has found a good model that actually makes it easy to do parallel programming.

      The reason is that threads and processes are the wrong tool for the task, they introduce a lot of additional complexity while still failing at giving you fine grained parallelization. They are patchworks that try to add the ability to do parallelization to languages that were built for sequential evaluation, instead of solving the problem from the ground up.

      Functional languages look like a much better solution for parallel programming. Without side effects, a very large part of parallel programming problems disappears instantly. There will be a need for some relearning as functional languages require a different approach to tackle a programming problem, but in the end that is what it needs. Look at GPU programming, the reason why it is trivial to apply parallelization to it, is because the infrastructure forces you to write your program in a way that can be parallelized. You can't make your fragment shader go crazy and draw over other pixels and stuff, you have one pixel that you have to fill and no side-effects outside of that and as a result you can throw as much parallel processing power at it as you like.
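
      The fragment-shader point can be mimicked in plain Python, as a sketch rather than real GPU code. `shade`, `WIDTH` and `HEIGHT` are made-up names. Each call sees only its own coordinate and returns only its own pixel, so the runtime is free to evaluate the calls in any order or all at once, and the result is the same as the sequential version.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 8, 4

def shade(coord):
    # Pure function: reads only its argument, writes nothing shared.
    x, y = coord
    red = x * 255 // (WIDTH - 1)      # horizontal gradient
    green = y * 255 // (HEIGHT - 1)   # vertical gradient
    return (red, green, 0)

coords = [(x, y) for y in range(HEIGHT) for x in range(WIDTH)]

sequential = list(map(shade, coords))
with ThreadPoolExecutor() as pool:
    parallel = list(pool.map(shade, coords))

print(parallel == sequential)
```

      No locks appear anywhere, because there is no shared state to protect. That is the property being credited to functional languages here.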

  35. Re:Parallel programming is dead. No one uses it... by Pinky's+Brain · · Score: 1

    Why would you even try that? For relevant workloads there are more requests than CPUs, handling multiple requests in parallel is inherently more efficient than trying to parallelize a single request. As for Oracle, the query level parallelization is pretty primitive ... but a cursory google search shows it exists http://www.orafaq.com/wiki/Parallel_Query_FAQ

  36. Re:Parallel programming is dead. No one uses it... by peragrin · · Score: 1

    Two points. First, if that request accesses both the web server and a database, then yes.

    Second, why would you want two CPUs to do one task? Why not have each CPU handle its own request and handle twice as many users?

    --
    i thought once I was found, but it was only a dream.
  37. Re:Parallel programming is dead. No one uses it... by Anonymous Coward · · Score: 0

    Threading is parallelism, if it's not then what is?

    Also, Team Fortress 2 has multiprocessor extensions.

  38. Re:Parallel programming is dead. No one uses it... by Cyberax · · Score: 1

    I'm trying to say that servers are a special case. They are trivially parallelizable.

    We still have not solved more complex parallelization tasks. For example, GUI is still mostly single-threaded.

  39. Split-C and Titanium + MPI by c0d3r · · Score: 1

    I remember back in the 90's UC Berkeley had developed Split-C and Titanium.

    Split-C was interesting in that it was C based and had some interesting concepts such as running a block on one or many processors, synchronizing processors and spread pointers (pointers across memory across machines).

    Titanium was a Java-like language for parallel processing, but at the time didn't have multithreading implemented.

    MPI seemed to be the main api used on standard languages.

  40. Re:Established vs new programming languages for HP by Pinky's+Brain · · Score: 1

    Being on the inside could you perhaps explain to me why they went with threading instead of message passing?

  41. Re:Parallel programming is dead. No one uses it... by Vanders · · Score: 1

    Threading i don't count as parallel processing for the desktop.

    Multiple threads on a single CPU may not be parallel, but the moment you add more than one core, of course it is parallel.

  42. Re:Parallel programming is dead. No one uses it... by Anonymous Coward · · Score: 0

    Ever heard of pipe-lining?

  43. Chapel by jbolden · · Score: 2, Interesting

    Looking at the 99 bottles Chapel code (from original article)
    http://99-bottles-of-beer.net/language-chapel-1215.html

    This looks like the way you do stuff in Haskell. Functions compute the data and the I/O routine is moved into a "monad" where you need to sequence. This doesn't seem outside the realm of the possible.

  44. Multi-threaded or Parallel? by ipoverscsi · · Score: 5, Insightful

    I have not read the article (par for the course here) but I think there is probably some confusion among the commenters regarding the difference between multi-threading programs and parallel algorithms. Database servers, asynchronous I/O, background tasks and web servers are all examples of multi-threaded applications, where each thread can run independently of every other thread with locks protecting access to shared objects. This is different from (and probably simpler than) parallel programs. Map-reduce is a great example of a parallel distributed algorithm, but it is only one parallel computing model: Multiple Instruction / Multiple Data (MIMD). Single Instruction / Multiple Data (SIMD) algorithms implemented on super-computers like Cray (more of a vector machine, but it's close enough to SIMD) and MasPar systems require different and far more complex algorithms. In addition, purpose-built supercomputers may have additional restrictions on their memory accesses, such as whether multiple CPUs can concurrently read or write from memory.

    Of course, the Cray and MasPar systems are purpose-built machines, and, much like special-build processors have fallen in performance to general purpose CPUs, Cray and MasPar systems have fallen into disuse and virtual obscurity; therefore, one might argue that SIMD-type systems and their associated algorithms should be discounted. But, there is a large class of problems -- particularly sorting algorithms -- well suited to SIMD algorithms, so perhaps we shouldn't be so quick to dismiss them.

    There is a book called An Introduction to Parallel Algorithms by Joseph JaJa (http://www.amazon.com/Introduction-Parallel-Algorithms-Joseph-JaJa/dp/0201548569) that shows some of the complexities of developing truly parallel algorithms.

    (Disclaimer: I own a copy of that book but otherwise have no financial interests in it.)
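
    To make the map-reduce case concrete, here is a toy word count in Python. It is a sketch on one machine with a thread pool, not a distributed implementation, and `map_chunk`, `merge` and `word_count` are hypothetical names. Unlike independent server threads, the pieces here cooperate on a single answer: the map phase counts words per chunk in parallel, and the reduce phase folds the partial counts together.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def map_chunk(lines):
    # Map phase: count words in one chunk, independently of the others.
    return Counter(word for line in lines for word in line.lower().split())

def merge(a, b):
    # Reduce phase: adding Counters sums the per-word counts.
    return a + b

def word_count(lines, chunks=4):
    size = max(1, -(-len(lines) // chunks))   # ceiling division
    parts = [lines[i:i + size] for i in range(0, len(lines), size)]
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_chunk, parts))
    return reduce(merge, partials, Counter())

text = ["the quick brown fox", "jumps over the lazy dog", "the end"]
counts = word_count(text)
print(counts["the"])
```

    Because `merge` is associative, the partial counts could be combined in any grouping, which is what lets the reduce step itself be spread across machines in a real system.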

    1. Re:Multi-threaded or Parallel? by ceoyoyo · · Score: 1

      Part of the problem is probably insisting on calling it "threaded" and "parallel" when you actually mean "embarrassingly parallel" and "more tightly coupled" or somesuch.

      The use or non-use of threads in any of their forms has nothing to do with whether you're talking about a very easily parallelized task, such as serving many web pages at the same time, or a more difficult-to-parallelize task such as searching.

  45. Re:Parallel programming is dead. No one uses it... by Mad+Merlin · · Score: 1

    For example, GUI is still mostly single-threaded.

    ...and not CPU bound!

  46. Re:Parallel programming is dead. No one uses it... by Beale · · Score: 1

    Er, what do you mean by that? Most good apps now keep a thread just for the interface, and new GUIs ship the compositing out to the graphics card, which contains vector processors to perform data parallel operations.

    That's at least three levels of parallelism, as I see it.

  47. Re:Parallel programming is dead. No one uses it... by Beale · · Score: 1

    My experience agrees with yours -- scientific parallel computing seems to be a lot like a battle between the scientists, who want to write quick, sloppy, easy code that just runs, and the parallel computing people who want them to write better code that will run with greater efficiency and in less time.

    When the project was put to IBM, Cray and Sun to design these new languages (Sun's Fortress didn't make the cut), one of the specifications was that they should have parallelism not as a quite hard to use bolt-on, like MPI, but that the parallelism should be intrinsic, built-in, and transparent where possible. (As far as I recall.)

    I really like the UPC and Coarray Fortran models, which give parallel features that -look- intrinsic: UPC looks like C but with shared type variables (eg. shared int shared_array[4]). The PGAS model works well for this sort of thing, and doesn't scale too badly if you know enough to work out roughly how it works. Fortunately, IBM have picked up UPC and look to be interested in implementing a native compiler for it (alphaWorks has a shared memory parallel version) as well as the Berkeley UPC source-to-source compiler, and a couple of people are implementing Coarray Fortran modules for Fortran compilers.

    There are some good people working on these new parallel methods -- and working on bringing them at least part-way to the programmers, rather than making the programmers come to them.

  48. Re:Established vs new programming languages for HP by Raul654 · · Score: 2, Informative

    I think you're confusing two different uses of parallelism. One is "small" parallelism -- the kind you see in graphical user interfaces. That is to say, if Firefox is busy loading a page, you can still click on the menus and get a response. Different aspects of the GUI are handled by different threads, so the program is responsive instead of hanging. That's done by using threading libraries like POSIX threads and the like. But that's really a negligible application of parallelism. The really important use of parallelism is for really large programs that require lots of hardware to run in a reasonable amount of time.

    The threading libraries that you use for GUI applications don't work well for computationally intensive applications requiring parallelism. They require a shared-memory architecture. (A shared-memory architecture is one in which all processors see the same values in the RAM. E.g., if processor #1 writes X to memory block 0x87654321, and processor #2 reads 0x87654321, it returns X instead of whatever value processor #2 last wrote there.) Shared-memory architectures don't scale -- the biggest ones you can buy have about 64 CPUs. If you do want to run computationally intensive applications on shared-memory architectures, then OpenMP is the library of choice. It's also fairly simple to use.

    If you want to run big applications, you need to use a distributed memory architecture. And MPI (Message Passing Interface) is pretty much the only game in town where that is concerned. It's by far the dominant player.
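
    For anyone who hasn't seen the message-passing style, here is a toy Python sketch of it. It is NOT MPI: the "ranks" are just threads in one process and a queue.Queue stands in for the interconnect, and the names worker and run are made up for illustration. What it does show is the discipline MPI forces on you: each rank works only on data it owns, and the only way results move is by explicit send and receive.

```python
import queue
import threading


def worker(rank, n_ranks, n_items, to_root):
    """Each rank owns only its own slice of the problem (a strided range here)."""
    local = range(rank, n_items, n_ranks)
    to_root.put((rank, sum(local)))  # "send" the partial result to the root


def run(n_ranks=4, n_items=100):
    to_root = queue.Queue()  # stand-in for the network between nodes
    threads = [
        threading.Thread(target=worker, args=(r, n_ranks, n_items, to_root))
        for r in range(n_ranks)
    ]
    for t in threads:
        t.start()
    # The root "receives" exactly one message per rank and combines them.
    total = sum(to_root.get()[1] for _ in range(n_ranks))
    for t in threads:
        t.join()
    return total


print(run())
```

    Because nothing is shared except the messages, the same structure can be spread over separate machines, which is the property shared-memory threading lacks.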

    --


    To make laws that man cannot, and will not obey, serves to bring all law into contempt.
    --E.C. Stanton
  49. Re:Parallel programming is dead. No one uses it... by Anonymous Coward · · Score: 0

    You are absolutely correct. Oracle does not do true parallel processing. As a result of this, companies like Mark Logic that specialize in XML databases utilizing XQuery are able to beat Oracle in performance for large server farms. This is due to the fact that XQuery operates in a truly functional manner, like Erlang, allowing for peak parallelization.

  50. Re:Parallel programming is dead. No one uses it... by MikeBabcock · · Score: 1

    Multi-threaded networking isn't new. Most requests are networked. Multi-threaded data sorting and searching isn't new either. Multi-threaded disk handling isn't either.

    Whether these tasks have historically been made multi-threaded or not has more to do with the lack of threading support on the CPUs themselves IMHO.

    --
    - Michael T. Babcock (Yes, I blog)
  51. Re:Chapel? by Raffaello · · Score: 3, Insightful

    No widely spoken human natural language was "invented," Modern English included. Where do people come up with these things? Modern English evolved out of Middle English, just as Spanish evolved out of Latin, etc. Modern English was not "invented" in any meaningful sense of "invented."

    reference for WikiWeenies.

  52. Re:Established vs new programming languages for HP by Anonymous Coward · · Score: 0

    Shared-memory architectures don't scale -- the biggest ones you can buy have about 64 CPUs

    The shared memory solutions by SGI and Cray provide up to 8024 processors (Cray XMT).

  53. Re:Established vs new programming languages for HP by Pinky's+Brain · · Score: 1

    I use threading as shorthand for communication through shared memory with manual locking. PGAS to me seems firmly targeted at the shared memory with manual locking type of programming; the G in PGAS isn't really necessary for message passing.

    PS. I realize that up to a certain point just accessing remote memory is more efficient than passing on all the data in a message, but those kind of optimizations can be done transparently with pure message passing too (with MOBILE data types in modern Occam for instance). So a completely global memory model is not necessary.

  54. Re:Established vs new programming languages for HP by Beale · · Score: 1

    As far as I'm aware, those larger systems are ccNUMA -- cache-coherent non-uniform memory access, in which processors have local memory, but can also access other processors' memory via a network layer. It doesn't quite support the same kind of scaling as proper, uniform access shared memory.

  55. Mod parent up by jpmorgan · · Score: 1

    A lot of commenters don't seem to understand this distinction. Personally, I like to use the terms concurrent program and parallel program to distinguish.

    There is a lot of concurrent software out there... the often cited examples of web servers, database engines, UI threads and the like are concurrent. There is parallel execution going on, but fundamentally each thread is working on a separate task and the problem that needs to be solved is making sure that the threads don't step on each other's toes (deadlocks, priority inversions, etc.). This is tough, but it's simpler than parallel programming.

    In a truly parallel program, you have parallel streams of execution solving one problem. This is two orders of magnitude harder, since most of the algorithms we use regularly aren't easily parallelizable. I do this as my day job; it's two orders of magnitude harder since for every day we spend working on the concurrency problems, we spend a hundred working on the parallelization problems.

  56. Re:Parallel programming is dead. No one uses it... by Parallax48 · · Score: 1

    It is easy to create a single worker thread that does some offline processing, say creating a document to print. It is harder to split a task into n threads where n scales over time (years after your program was shipped) to match the number of available cores in the machine. Harder still to ensure that all these threads are kept busy with low synchronisation overhead.

    I argue that a program gets more "Parallel" as it becomes more scalable in the sense I just described.
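
    A minimal Python sketch of what I mean by sizing the work to the machine at run time rather than at ship time. The helper names (chunked, parallel_sum_of_squares) are made up for illustration. One loud caveat: in CPython, threads doing pure-Python arithmetic are serialized by the interpreter lock, so this shows the structure, not a real speedup; a process pool or a native library would be needed for that.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def chunked(seq, n):
    """Split seq into n contiguous chunks whose sizes differ by at most one."""
    size, extra = divmod(len(seq), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        chunks.append(seq[start:end])
        start = end
    return chunks


def parallel_sum_of_squares(values, workers=None):
    # Ask the machine how many cores it has today, not when the code shipped.
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda chunk: sum(v * v for v in chunk),
                            chunked(values, workers))
        return sum(partials)  # single cheap merge step at the end


print(parallel_sum_of_squares(list(range(10)), workers=3))
```

    The only synchronisation is the final merge of one number per worker, which is the "low synchronisation overhead" part.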

  57. Re:Parallel programming is dead. No one uses it... by Parallax48 · · Score: 1

    I like to think of all programming languages as a way to translate our requirements into a highly explicit form that the computer can understand. We then choose how explicit we want to be (assembly -> C -> Python (say) -> ?)

  58. Re:Parallel programming is dead. No one uses it... by slashtivus · · Score: 1

    While the GPU will certainly parallelize the drawing operations with its specialty hardware, the calling thread is going to have to wait / block for that to complete before doing a refresh.

    The blocking portion of that goes back to being a serial operation (waiting for something else to complete); the software is just using the appropriate hardware for the task at hand. I don't think that really qualifies as "parallel" in this case.

    Sorry to be nit-picky.

  59. Re:Parallel programming is dead. No one uses it... by Beale · · Score: 1

    But other threads can do processing on the CPU while that's taking place.

  60. Re:Parallel programming is dead. No one uses it... by wirelessbuzzers · · Score: 1

    Threading isn't parallelism, it's concurrency. They aren't the same thing.

    A concurrent program is logically divided into several interacting threads. Concurrency is about the program's semantics, not its performance.

    A parallel program is physically divided into several (not necessarily interacting) operations (usually threads, but might be vector ops) which run at the same time on different pieces of hardware. Parallelism is about the program's performance, not its semantics.

    A web browser should be concurrent in that the network thread shouldn't block the Javascript thread shouldn't block the interface. It's less important that it be parallel, though this will improve performance and probably responsiveness when dealing with Javascript-intensive sites and flash movies.

    A web server should usually be parallel for better performance. It's not particularly concurrent because its threads don't interact much.

    It happens that writing a program concurrently makes it easier to parallelize, but it doesn't guarantee that it will be run in parallel. For example, Concurrent ML performs pretty well despite not supporting SMP.
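
    A small Python sketch of concurrency without parallelism, in the same spirit as the Concurrent ML remark. It uses the standard asyncio module; the task names "network" and "ui" are just labels I picked. Two logically separate activities interleave, yet every step runs on one OS thread.

```python
import asyncio
import threading


async def worker(name, steps, log):
    for i in range(steps):
        # Record which OS thread ran this step.
        log.append((name, i, threading.get_ident()))
        await asyncio.sleep(0)  # voluntarily hand control to the other task


async def main():
    log = []
    # Two concurrent activities, scheduled cooperatively by one event loop.
    await asyncio.gather(worker("network", 3, log), worker("ui", 3, log))
    return log


log = asyncio.run(main())
names = [name for name, _, _ in log]
thread_ids = {tid for _, _, tid in log}
print(names)
print(len(thread_ids), "thread(s) used")
```

    The program is structured as two cooperating tasks (semantics), but nothing here ever executes at the same time on different hardware (performance).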

    --
    I hereby place the above post in the public domain.
  61. Re:Chapel? by Eskarel · · Score: 1

    Mandarin was invented, quite cleverly too(at least the written form).

  62. Re:Chapel? by sam_handelman · · Score: 1

    Did you read the parent? Perl was not invented from whole cloth either, it was based on awk and C. Even most constructed spoken languages, Esperanto for example was based on modern derivatives of Latin. Was Esperanto not therefore "invented"?

    --
    The good and new comes from no quarter where it is looked for, and is always something different from what is expected.
  63. Re:Parallel programming is dead. No one uses it... by ceoyoyo · · Score: 1

    Threading I don't count as parallel processing for the desktop.

    There are no vehicles that carry more than two people. Anything with more than two wheels I don't count as a vehicle.

  64. Re:Parallel programming is dead. No one uses it... by Johnno74 · · Score: 1

    SQL Server most definitely can use multiple CPUs to process a single query. For nasty report queries etc it can speed things up significantly.

    The problem is there is a large amount of cost in splitting the query into multiple parallel tasks and then merging the results, and for a simple query this outweighs the gains made. SQL Server's default config can be a bit trigger-happy in deciding to do a parallel query, for OLTP applications I normally disable it.

  65. Re:Chapel? by Anonymous Coward · · Score: 0

    So the bless keyword had to be a biblical reference, it couldn't have been in the form of an augmented constructor without biblical literary reference?

    HORSESHIT, I say.

  66. it's getting easier, not harder by drfireman · · Score: 2, Interesting

    Recent versions of gcc support OpenMP, and there's now experimental support for a multithreading library that I gather is going to be in the next C++ standard. These don't solve everyone's problems, but certainly it's getting easier, not harder, to take better advantage of multi-processor multi-core systems. I recently retrofitted some of my own code with OpenMP as a test, and it was ridiculously easy. Five years ago it would have been a much more irritating process. I realize not everyone develops in C/C++, nor does everyone use a compiler that supports OpenMP. But I doubt it's actually getting harder, probably just the rate at which it's getting easier is not the same for everyone.

  67. "for" or "while" loops by wfstanle · · Score: 1

    "Any program with a "for" or "while" loop in which the results of one iteration"

    Think about it... In most real world applications, a for or while loop does depend on some variable in a previous iteration. If something inside a loop does not change you have an infinite loop. Something has to change. In a for loop this is the index variable but in this case, something has changed.

    1. Re:"for" or "while" loops by Haeleth · · Score: 2, Informative

      That's not what he meant. Yes, technically the index variable is changing, and it may also be used inside the loop. But the relevant questions are: "does it matter what order the iterations run in", and "does one iteration have to finish before the next can begin".

      If the answer to both questions is "no", then you can run several loop bodies at once on different processors. Bingo, instant speedup.
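
      A quick Python sketch of the two cases, using only the standard library. The data and function names are made up for illustration. The first loop has independent iterations, so it can be handed to a pool as-is; the second carries a running total from one iteration to the next, so it can't be split naively.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x * x


data = [1, 2, 3, 4, 5]

# Independent iterations: no iteration reads anything another one wrote,
# so order and overlap don't matter. pool.map still returns results in
# input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    squares = list(pool.map(square, data))

# Dependent iterations: each step needs the total left by the previous
# step, so one iteration must finish before the next can begin.
running = []
total = 0
for x in data:
    total += x
    running.append(total)

print(squares)
print(running)
```

      Note that the loop index changes in both loops; what matters is whether an iteration consumes a result produced by an earlier one.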

    2. Re:"for" or "while" loops by Jedi+Alec · · Score: 1

      "Any program with a "for" or "while" loop in which the results of one iteration"

      Think about it... In most real world applications, a for or while loop does depend on some variable in a previous iteration. If something inside a loop does not change you have an infinite loop. Something has to change. In a for loop this is the index variable but in this case, something has changed.

      perl alert!!!

      while (my $shit = shift(@pile_of_shit))
      {
          smell_like_roses($shit);
      }

      In this example the state of the other slabs of shit don't matter one bit to the one being treated to smell like roses. I'm running 1 production line while I could be running 8 at the same time.

      Honestly, it makes me cry each time I see a game max out 1 core of my i7 and stutter at the hard stuff when all the cpu power is just sitting there unused.

      --

      People replying to my sig annoy me. That's why I change it all the time.
  68. Re:Chapel? by wisty · · Score: 1

    Elvish? Klingon?

    Oh, wait, a human language. Huff.

    Wait, aren't programmers human?

  69. Re:Parallel programming is dead. No one uses it... by bar-agent · · Score: 1

    Well, if an application is written to use threads, then it is both parallel and concurrent, in that the threads could be run on different cores/processors/whatever if they are available.

    So you haven't heard of any games or other applications written to be parallel? If they are threaded, they are written to be parallel.

    --
    i'd hit it so hard, if you pulled me out you'd be the king of britain [bash.org]
  70. LIBRARIES!! by HiThere · · Score: 2, Interesting

    The main problem faced by each new language is "How do I access all the stuff that's already been done?"

    The "Do it over again" answer hasn't been successful since Sun pushed Java, and Java's initial target was an area that hadn't had a lot of development work. Sun spent a lot of money pushing Java, and was only partially successful. Now it probably couldn't be done again even by a major corporation.

    The other main answer is to make calling stuff written in C or C++ (or Java) trivial. Python has used this to great effect, and Ruby to a slightly lesser one. Also note Jython, Groovy, Scala, etc. But if you're after high performance, Java has the dead weight of an interpreter (i.e., virtual machine). So that basically leaves easy linkage with C or C++. And both are purely DREADFUL languages to link to, due to pointer/integer conversions and macros. And callbacks. Individual libraries can be wrapped, but it's not easy to craft global solutions that work nicely. gcc has some compiler options that could be used to eliminate macros. Presumably so do other compilers. But they definitely aren't standardized. And you're still left not knowing what's a pointer so you don't know what memory can be freed.

    The result of this is that to get a new language into a workable state means a tremendous effort to wrap libraries. And this needs to be done AFTER the language is stabilized. And the people willing to work on this aren't the same people as the language implementers (who have their own jobs).

    I looked over those language sites, and I couldn't see any sign that thoughts had been given to either Foreign Function Interfaces or wrapping external libraries. Possibly they just used different terms, but I suspect not. My suspicion is that the implementers aren't really interested in language use so much as proving a concept. So THESE aren't the languages that we want, but they are test-beds for working out ideas that will later be imported into other languages.

    --

    I think we've pushed this "anyone can grow up to be president" thing too far.
  71. Ada (since 1983) by krischik · · Score: 1

    Well, there is a non-interpreted, i.e. compiled, language where the language designers did not shy away from tasking:

    http://en.wikibooks.org/wiki/Ada_Programming/Tasking

    And tasking has been there since the original standard in 1983. Only problem: Ada was always considered heavyweight and difficult. In my opinion neither is true - after all the ISO standards for C and Ada are only a few pages apart in size.

    Martin

    1. Re:Ada (since 1983) by HiThere · · Score: 1

      Ada's main problem is that it's quite difficult to use strings. Fixed strings of different lengths are different types, and Fixed string is the default string type. If unbounded strings were the default (and what literal strings were) and Fixed strings were an optimization, this problem would go away.

      Of course, there's still the library problem. Ada can easily call C routines, *IF* the routines expect arguments of a compatible nature. If not... well, it's do-able, but it requires upkeep. Look at the state of Gtk support in Ada. It was done quite awhile ago, and the last time I looked, they were still fighting with an old version of Gtk. Database support would be as bad, but until quite recently NOBODY has done Database support well. (SQL is powerful, but manipulating SQL strings to get at your data is worse than awkward.) I rather like some of the recent approaches I've encountered (e.g., Python's SqlObject), but I don't think that it's a high performance approach. Nothing that involves storing numbers as strings is going to be high performance. Oracle's BerkeleyDB-je approach has promise (class annotations), but again I have a hard time thinking of anything for Java as being high-performance. I suspect that it's just easier to use...not that that's to be sneezed at.

      Ada is, indeed, a high performance language that has had separate tasking built into it for quite a while. But it has run into the problem of Libraries and generally ignored it. As a result it has niche superiority and use. It's a rather nice language except for its problem with strings.

      OTOH, I admit to doubts about any language that doesn't handle garbage collection being usable in a distributed multi-processor environment...except for largely single-threaded applications. Memory management is difficult enough in single-threaded applications. Once you expand to multi-tasking where each task can be multi-threaded... well, I expect memory leaks to become MUCH more common, unless that language itself handles garbage collection.

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.
  72. Re:Parallel programming is dead. No one uses it... by Anonymous Coward · · Score: 0

    Not just 2 CPUs. Can Apache use 2 CPUs, located on different machines to serve one request, almost twice as fast as it can serve the same request using just one CPU?

  73. Why new language, while there are good'old ones... by kafka.fr · · Score: 1

    Can someone explain to me why we would need a new language? There are languages designed from the ground up to allow parallel processing. Yes, structures inside the core language, not some extra libs. Just to name one: Ada, designed to allow parallel computing since before 1979 (yes, that's 30 years ago). Reference: http://www.adaic.org/whyada/multicore.html. Even on a single processor, doing memory-bound stuff (such as a sort on a huge amount of data) in parallel can show significant improvement.

  74. Inmos had it right by Tjp($)pjT · · Score: 2, Interesting

    In the let the compiler decide attitude of the C language families ... Inmos C had the correct solution. You add two new keywords to the language, parallel and sequential.
    sequential
    {
        stmt1;
        stmt2;
        stmt3;
    }

    as opposed to

    parallel
    {
        stmt4;
        stmt5;
        stmt6;
    }

    The stmt1 must be executed before stmt2 which must be executed before stmt3 in the sequential construct. C languages actually already support this in a bit more awkward way with the comma operator. But sequential is an easier to understand and read method, and balances nicely the parallel keyword. The compiler and runtime have been told that stmt4, stmt5, and stmt6 can be executed in parallel. There is implicit synchronization at the end of the statement block.
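
    For readers who never saw the original, here is a rough emulation of the two block types in Python. This is only an analogy I wrote for this post, not the Inmos syntax or runtime: the functions sequential and parallel are made-up names, and each "statement" is passed in as a zero-argument callable.

```python
from concurrent.futures import ThreadPoolExecutor


def sequential(*stmts):
    """Run each statement in order; each finishes before the next starts."""
    return [stmt() for stmt in stmts]


def parallel(*stmts):
    """Run statements with no ordering guarantee between them.

    Leaving the with-block waits for every statement to finish, which
    plays the role of the implicit synchronization at the closing brace.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(stmts))) as pool:
        futures = [pool.submit(stmt) for stmt in stmts]
        return [f.result() for f in futures]


log = []
sequential(lambda: log.append("stmt1"),
           lambda: log.append("stmt2"),
           lambda: log.append("stmt3"))
results = parallel(lambda: 4, lambda: 5, lambda: 6)
print(log, results)
```

    The results list comes back in submission order even though the statements themselves may have run in any order, so the caller sees a single well-defined join point.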

    This is all well and good and many people look and say that it would not be so tough to do this in other ways, and so on. But combine this with fast iterators as are in Objective C 2.0 and it gets much more interesting. Or for the generalized case where any place a left brace is permissible, either of these two constructs could be substituted. This generalizes to braces enclosing a conventional block of statements as exists now, a forced sequential block of statements (so that side effects from say external inputs or other volatile entities can be dealt with at the specific case where needed) or a statement block where the contained statements may be executed in parallel. The programmer still has to have a bit of knowledge here, but the compiler and runtime can really lighten the load. And it does not have a syntax clash with either C, C++ or Objective C so could be adopted by all of them.

    I used this back in the 1980s and it was awesomely easy to deal with dispatch of hundreds of lightweight instances. Essentially fibers in a more modern vernacular. By partitioning the work between the compiler and the runtime systems I ran the same binary code across quad processor and 64 processor arrays. (Ancillary to this discussion was that Inmos Transputers had also built in message passing on dedicated links in hardware. Of course Fortran was also supported as was Pascal, but the main pushed language was Occam. And hardware timers were there as a data type too to make scheduling a breeze. Processors w/o hardware timers just mimicked them in the runtime. And locks were supported in the hardware as well...)

    The point is this was an elegantly solved problem in the 1980s that was mostly forgotten. It was a simple matter to have the runtime aware of the fabric an individual process could access and just turn stuff loose. But that part is a bit outside the main discussion, like I don't drift enough already!

    --
    - Tjp

    I am in wallow with my inner money grubbing capitalistic pig. ... Oink!

  75. erlang? by grimborg · · Score: 1

    imho you shouldn't be worrying about parallelizing things, the compiler/interpreter/whatever should take care of that. How about erlang for the job?

  76. Written form of Korean was invented by wisebabo · · Score: 1

    The written form of Korean was invented by King Sejong sometime in the 15th century I think. He hired a bunch of scholars who devised a simplified phonetic script (as opposed to the ideographic Chinese characters they were using). That's why Korean looks really really different from Chinese, the characters are supposedly designed to represent the throat and tongue configuration needed to pronounce the character.

    He forced the populace (under pain of death?) to use it but it worked, literacy rates shot up. All Koreans now use it North and South (but with some Chinese characters used occasionally) and King Sejong is widely revered, I believe he's on South Korean currency (Kim Jong-Il is probably on the North, surprise).

  77. Re:Chapel? by Anonymous Coward · · Score: 0

    Elvish? Klingon?

    Oh, wait, a human language. Huff.

    Wait, aren't programmers human?

    Esperanto?

  78. Overly-pervasive imperative programming at fault? by amn108 · · Score: 1

    A big part of the problem as we have it already is rooted in the ways we operate computers, ever since it was the only way to do so (slow, little memory, etc).

    We tell our computers not only what to do but out of sheer paranoia HOW to do it. This is because we are not confident we have taught (programmed) the computer to make good decisions and map the road of solutions from the problem to what we want the computer to do, so we employ languages like C to map out every turn the program blindly has to take, no matter the road it is put on really. As most of the programming world, out of habit and what not, employs the most slavish forms of imperative programming, what are the chances that a compiler-translator or an operating system (or the underlying hardware which IS INHERENTLY IMPERATIVE by nature) are able to override the decisions that the programmer itself has explicitly made on its behalf and thus ordered it to follow strictly? Granted, some compilers/translators do have freedom of interpretation, but it is also subject to language specification, I mean if you express A is implemented in a B way, then there is so much the compiler can do and no more.

    To make an analogy, if your teenage son/daughter interns at your law firm as your private secretary of sorts, when told "fetch me that contract from the finance dep. on second floor and bring me a good pencil from third for signing it" he/she might not catch and comment on the fact that you don't sign a contract with a pencil, and just follow through your order blindly. If you taught him/her the art of contracts though, he/she might become a much better secretary, might eventually replace you as well :-)

    When the programmer assumes he knows most (of all parties involved) EXACTLY HOW the program should solve the problem across time/space, not only for their own testing hardware but for all the combinations of architectures and environments of his programs' users, then those chances are even slimmer. This is a program optimization problem, which surfaces when we try to compile our serial source code to run on very parallel systems.

    So here we are, discussing ways to parallelize our solutions to common problems, when we are like a one-eyed master who tells a slave not only what to do but also how to do it, instead of educating the slave so that he, having better depth perception, can better guide the master. And I am not talking about sloppy out-of-college programmers, this happens to the very best, because the habit was there for so long, I mean we had to tell the slave what to do because historically, that slave character was much more blind than the master and severely handicapped in many areas.

    In essence, if we put parallel programming paradigms into an imperative language, how is this going to prevent even great programmers from assuming too much? We need to teach computers how to map the solution themselves, with us only specifying the constraints of such solution, or goals of the program. You might say that such assumptions on human part are always a necessity, because we are just not there yet to have sufficiently intelligent HCI translators, but we should try nevertheless, for the sake of solving several problems at once with one broad look at things. Like one guy here said, how sad is it that opening a bunch of pages in Firefox that do absolutely nothing maxes out a modern multi-core, superscalar, out-of-order executing CPU. Is it the fault of a) Firefox slavishly told what to do by programmers that wrote its C code? b) Operating system slavishly doing what kernel calls coming from Firefox tell it to do? or c) the underlying hardware slavishly doing what the CPU tells it to do? I'd guess all of the above are equally involved. But can you blame either? All three are, by design, doing their job as they are told.

  79. I hate it when ... by Anonymous Coward · · Score: 0

    I hate it when I get asked if I've finished my Forking program by the boss. You think he'd mind his language!

  80. Re:Chapel? by FordPrefect276709 · · Score: 1

    it wasn't "invented" - it was CREATED! (something as beautiful & wide-spread as the English language could only be created by the spaghetti-monster!!)

  81. who has problems with threads? by Bouncelot · · Score: 1

    "Getting the most from multicore processors is becoming an increasingly difficult task for CODERS" - Fixed. Real programmers have no problems handling multiple threads/processes.

  82. Re:Chapel? by Anonymous Coward · · Score: 0

    How exactly was mandarin invented?

    And what exactly do you mean with "the written form"?
    The written form is basically just a way of writing the spoken language, it isn't language on its own.

    I recommend reading for example this . There's a lot of other great stuff to read on the same site.

  83. Interesting article by Banador · · Score: 2, Interesting

    Threads Cannot be Implemented as a Library. That means pthreads is bad. Read: http://www.hpl.hp.com/techreports/2004/HPL-2004-209.pdf

    Then after a few years, work on Java memory model has found a good solution. Read: Foundations of the C++ concurrency memory model [based on the Java memory model] http://www.hpl.hp.com/techreports/2008/HPL-2008-56.pdf

    How fugly can this be for all you C++ wannabe fanguys??? (Phun intended!)

  84. Multitasking by Anonymous Coward · · Score: 0

    Multitasking is a great idea. So great that even Windows does it :-) Now, what happens when you have one program doing some heavy stuff, and another waiting for user input? The one waiting for user input becomes annoyingly slow, to the point where people tend to give up on multitasking, and just let the heavy program run while they go outside to do something else.

    Then some brilliant people decided to fix the problem. If you have two programs running, and two cores, the heavy program cannot slow down the other program (well, as long as at least one of them stays away from the hard drive). Great, now multitasking works as it is supposed to. You don't notice at all that there is another program running in the background.

    Now, some asshole decides that the one heavy program - the one that you finally don't even notice is running - should be able to use multiple cores. Once again, the heavy program will be using the same CPU as the one the user is waiting for. The system feels slow. But this time, more cores is not going to help. The heavy program is built not for 2 or 4 cores, but for perhaps 64 or even more cores. So no matter how big a CPU he can afford, it will still slow him down.

    Why do some people seem to hate when computers get fast enough to multitask smoothly?

  85. The JVM by smbell · · Score: 1

    I'm sure I'll get the "Java is slow and it eats all my memories" crap, but here it goes.

    The JVM (Java Virtual Machine) is one of the few platforms that have a well defined memory model (Short Description Wikipedia)

    The main problem in parallel programming is dealing with data across different threads, knowing when data written in one thread is visible from another thread, and efficiently communicating between threads. The JVM platform can handle all of this in a deterministic manner, which is key.
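
    To show the kind of guarantee I mean, here is a small sketch. It is written in Python rather than Java, purely to keep it short, so it is an analogy and not the JVM memory model itself; the Mailbox class is made up for illustration. The idea is the same one a memory model pins down: the writer stores the data first and then signals, and a reader that has seen the signal is guaranteed to see the data.

```python
import threading


class Mailbox:
    """One-shot hand-off of a value from a writer thread to a reader thread."""

    def __init__(self):
        self._ready = threading.Event()
        self._value = None

    def publish(self, value):
        self._value = value  # 1. write the data
        self._ready.set()    # 2. then signal that it is there

    def read(self, timeout=None):
        # The reader only touches the value after observing the signal.
        if not self._ready.wait(timeout):
            raise TimeoutError("nothing was published in time")
        return self._value


box = Mailbox()
writer = threading.Thread(target=box.publish, args=(42,))
writer.start()
value = box.read(timeout=5)
writer.join()
print(value)
```

    Without a defined ordering between step 1 and step 2, a reader on another core could in principle see the signal and still read stale data; a well-defined memory model is what rules that out.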

    Now I say JVM here because it's the platform, and not the Java language, that makes it all work. Java the language (as of 1.5) has great concurrency support, but there are also other languages built with concurrency in mind from the get-go like Clojure and Scala.

    Plus it all works cross platform.

  86. +1, informative by weston · · Score: 1

    Had never heard of Inmos before... and it's generally true that many problems in the computing world have been addressed before elsewhere...

  87. Re:Chapel? by Eskarel · · Score: 2, Informative

    Mandarin is/was the court language of China, it was created in a rather clever way.

    The country, being very large, had a number of different dialects. Mandarin was developed so that while the spoken word might be different in different areas of the country, the Mandarin text was identical regardless of where you were. It was a language of scholars, and regular people never really learned it (they didn't need to). The spoken form wasn't used much by anyone who wasn't a courtier (though neither was the written form).

    For example, Peking and Beijing are the same place, Peking is what they call(ed) it in the south, and Beijing is what they call it in the north, the written mandarin for both words is the same.

    To a certain extent nearly all written languages were initially created, because so very few people actually wrote in the early days that there wasn't really any sort of natural evolution for a lot of written languages. For a more specific example, during the early years of the Soviet Union, when the Soviets were encouraging the development of their ethnic minorities (as opposed to kicking them off their land and putting them in work camps as they did a few years later), the Soviet government actually sent linguists out to some of the more nomadic of these groups to develop a written form of their language, which prior to this effort had never existed. That's not even counting resurrected languages which likely bear no significant resemblance to their previous forms, but which are spoken by real live people, or made up languages like Klingon that regular folks .

    Languages are both created and naturally evolve, and written languages and spoken languages do not always begin at the same time, are not always used by the same people, and are sometimes rather arbitrary.

  88. Parallel Processing - see FPGA by surdumil · · Score: 1

    I think that the whole parallel processing notion is being stood on its head by logic developers working on embedded FPGA apps. The languages of choice are mostly based on VHDL and Verilog. The processors involved can be DSP hardware blocks, embedded processors (implemented in hardcore or softcore logic), and an arbitrary number of custom state machines, all operating independently or in various lock-step arrangements. Communications between processors are defined as required. The latest FPGAs offer a couple of thousand DSP blocks implemented in hard logic, all with localized memory stores. They can be arbitrarily grouped and/or can run independently, and can be clocked at different rates according to requirements. The resulting parallel processing power and versatility are astounding.

  89. Re:Parallel programming is dead. No one uses it... by twopoint718 · · Score: 1

    Agreed. Looking at how some computing resources are set up, there is as much (maybe?) of an emphasis on high-throughput computing as on parallel computing ( http://en.wikipedia.org/wiki/High-Throughput_Computing ). As in "how do we make all these long-running jobs complete reliably and so as to use the available hardware efficiently?"

  90. CSP is the right way to do Multi-Threading. by ralph.corderoy · · Score: 1

    Any discussion of parallel programming would benefit from having read and understood the resources and history covered by Russ Cox at http://swtch.com/~rsc/thread/
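
    For readers who have not met the CSP style, here is a minimal sketch of the idea in Python: independent sequential processes that share no state and talk only over channels. It assumes threads stand in for processes and a bounded queue.Queue stands in for a channel; the names producer, squarer, run_pipeline and the DONE sentinel are made up for illustration and do not come from the linked page.

```python
import queue
import threading

DONE = object()  # hypothetical sentinel marking the end of a stream


def producer(out_ch, items):
    """Send each item on the channel, then signal completion."""
    for item in items:
        out_ch.put(item)
    out_ch.put(DONE)


def squarer(in_ch, out_ch):
    """Receive values, square them, and pass them downstream."""
    while True:
        item = in_ch.get()
        if item is DONE:
            out_ch.put(DONE)
            return
        out_ch.put(item * item)


def run_pipeline(items):
    """Wire two processes together with channels and collect the output."""
    a = queue.Queue(maxsize=1)  # a small bound approximates a rendezvous
    b = queue.Queue(maxsize=1)
    threads = [
        threading.Thread(target=producer, args=(a, items)),
        threading.Thread(target=squarer, args=(a, b)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        item = b.get()
        if item is DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

    Calling run_pipeline([1, 2, 3]) pushes the values through both stages in order. No stage ever locks anything itself; all the synchronization lives in the channels.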

  91. Re:Inmos had it right - not by Animats · · Score: 1

    sequential as opposed to parallel

    That's been tried many times, especially in concurrent FORTRAN variants. The problem is that it doesn't help with locking and interlocking unless the compiler is able to determine that "parallel" tasks really are independent. In some number-crunching applications, that's possible. If the "parallel" tasks are allowed to have any access to shared data, more structure is needed in the program. It's easy to say that you want something done in parallel. The problem is detecting race conditions, providing the locking necessary to deal with them, and figuring out the invariants of the shared data.
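
    To make the point about invariants concrete, here is a small sketch in Python (not FORTRAN, and not any particular parallel dialect). The Accounts class and the hammer helper are hypothetical. The invariant is that a + b stays constant, and the lock is what keeps other threads from seeing or interleaving with the half-finished update.

```python
import threading


class Accounts:
    """Two balances whose sum is the invariant that must never change."""

    def __init__(self, a, b):
        self.a = a
        self.b = b
        self._lock = threading.Lock()

    def transfer(self, amount):
        # Without the lock, another thread could observe or interleave with
        # the state between the two updates, when a + b is temporarily wrong.
        with self._lock:
            self.a -= amount
            self.b += amount

    def total(self):
        with self._lock:
            return self.a + self.b


def hammer(accounts, n_threads=4, n_transfers=10_000):
    """Run many concurrent transfers and return the final total."""
    def worker():
        for _ in range(n_transfers):
            accounts.transfer(1)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return accounts.total()
```

    Saying "run these in parallel" was the easy part (one Thread per worker). Knowing that a and b belong together, and that both methods must take the same lock, is the structure the programmer has to supply.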

    Many, many ideas have been tried in this area. Most of them break down when there's a need for complex interaction between threads, as in a window system, browser, or MMORPG server. Partitioning of problems like inverting big matrices in parallel is well understood. But that's not why most people get multi-core CPUs today.

  92. Insightful Answer by krischik · · Score: 1

    As an Ada advocate I often get crap answers putting Ada down. But your answer is better - you know what you are talking about.

    The String point is indeed valid. We have to thank the high-integrity guys who wanted a language which can be used without heap memory - all data on the stack.

    Same goes for garbage collection - GC is actually advocated by Ada - but the embedded guys wanted it optional. Note that Ada has a distributed extension as well - which too should offer GC.

    And the C interface - well yes, it is difficult for a high-integrity language to interface with the array-is-pointer mess and still try to keep up at least a bit of integrity. Do remember, though: most viruses and trojans exploit precisely those C weaknesses which Ada does not like.

    A Desktop Ada would be nice, one which does have garbage collection - and perhaps an easier-to-use string. But it won't happen - it is not cool enough, it's difficult to design and develop, and also all new languages seem to be interpreted.

    Martin

    1. Re:Insightful Answer by HiThere · · Score: 1

      It's much easier to write portable interpreted languages, (But I don't call them "High Performance".)

      One interesting language I haven't mentioned is Eiffel. It's a language that COULD have been a contender. But again, it's hamstrung by its lack of libraries and the clumsy C interface. (Actually, I understand that its C interface is improving...but it's probably too late.)

      I expect that what will arise, finally, is a modified version of C or C++ that restricts pointers and doesn't allow conversion between pointers and integers. I may or may not like it, but it will be comfortable enough to enough people. It won't have a robust type system like Ada has (pity...mostly), because not enough people see the value. It might expand on the syntactical uses of enum types, though. It might have range types. It had *better* have garbage collection, or, as I mentioned previously, it won't solve the problem. (I think C++ has pushed that off until 2012 or later, but I'm not real sure.)

      Note that steps have already been taken towards all of these features within either C or C++. (See various boost library features, and some recent syntax adoptions.) The syntax that is used is GARBAGE!!!, but the features are there, which makes a syntax translator a reasonable thing to create. Then you could have a front-end to C or C++ that would itself be a decent language, and which only needed to make an isomorphic transform of the code to become compileable. (That's a one way transform, of course. The transform may be one-to-one, but it wouldn't be onto. There's no safe translation for freely convertible pointers.) This neo-C would be something that it was much more reasonable to write a garbage collector for, and it would be designed from the beginning to allow easy integration of C or C++ libraries.

      My feeling is that it would be very simple to create such a language NOW. What would be difficult would be convincing people to use it. The basic idea is to allow references, but to forbid pointers. That and a few other changes, like eliminating the preprocessor, but replacing it with safer methods for accomplishing the same things...or such of the same things as are reasonably safe. I think that all macro equivalent expressions should themselves be required to be syntactically correct. (Maybe that's been done. It's been a long time since I did extensive programming in C...especially of other people's code. And I avoid the use of macros in my own code.)

      FWIW, I've done most of my programming recently in Python or D...pretty much depending on whether I needed access to extensive libraries. Now I've hit a place where Python's too slow and D doesn't have the libraries...so I'm switching to Java. I find C's use of pointers too appalling to tolerate.

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.

  93. OpenMP by hitchhacker · · Score: 1

    What you describe seems very similar to what OpenMP provides for recent C/C++ and Fortran compilers. It is probably available in your C/C++ compiler already.

    #include <omp.h>

    int main(void)
    {
        #pragma omp parallel
        {
            // parallel block: run by every thread in the team
        }
        return 0;
    }

    OpenMP Sample Programs

    -metric

  94. Re:Chapel? by Anonymous Coward · · Score: 0

    I suspect the commenter is trying to make something of the first dictionaries in English, which standardised spelling. Someone had to decide (invent?) spellings for these books. Obviously the spelling wasn't locked down perfectly, but modern differences are smaller than those that people dealt with in the past. I suspect that translating the Bible into a language spoken more commonly was a motivation among the Protestant reform movements more than the other, more traditional sect, but was this the principal motivation? I find that hard to believe (though I don't really know).

    Also, probably the commenter was joking.

  95. COM to the rescue by Anonymous Coward · · Score: 0

    The easiest way I've found to take advantage of all the user's cores:

    Build COM server exes. Automatically call UnregisterClassObjects() after the first object is created, so the next attempt to create an object creates a new process. Now I have:
    1. Multiple processes running in parallel on multiple cores.
    2. An easy programming API (the COM interface).
    3. No unintended memory sharing, so very little chance of race conditions.
    4. A very testable surface. I can write test code in VB that exercise the COM objects.
    5. Ability to write different components in different languages.

    Nothing like 1990's technology to solve today's problems.

  96. Re:Chapel? by jpate · · Score: 1

    human NATURAL language, which (more or less by definition, actually) means a language which was not consciously designed by humans.

    This refers to the vast majority of languages (including modern English, as noted by GP) that people use to communicate with each other on a day-to-day basis (i.e. excluding math and logic journals).

  97. Win32 SLEEP API call = good for loops, in TS by Anonymous Coward · · Score: 0

    "So while the application isn't idle (eg crunching large datasets) the UI is unresponsive. Gotcha, that really solves the issue." - by Anonymous Coward on Monday June 08, @03:09PM (#28254419)

    You probably made the mistake I & others made a decade++ ago then: You didn't put in 'timeslicing' into your loops in your DB apps... how does this matter? Read on:

    It's GOOD, say while "crunching datasets", to put in a SLEEP API call in loops (especially during returned recordsets populating controls)

    AND, the really nice part about the SLEEP API call in Win32 is that you can set the "sleep interval" itself!

    (I.E./E.G.-> lowering it for some more "important users", based on their logon credentials which you can read from a "users table", & secure DB side... just so it works faster for them, vs. those that are "less important" who would get a higher rate inserted, making it slower for them, basically)...

    This comes into play MOSTLY in my experience, with returned recordsets from say, Oracle &/or SQLServer & populating grids or listboxes & yes, it can cause problems in TERMINAL SERVER/CITRIX run multiuser + multicampus scenarios...

    To solve it? Yes, the sleep API call worked, oddly when DoEvents would not!

    (Myself, & 3 others coded that information system, 1.5 million lines of VB code alone not counting server-side SQL, in VB6 to Oracle (PL-SQL) via both OO40 for writes, & native MS middleware in ADO for reads + to this day, 10++ yrs. later? It has been running without major modification running a major companies sales, shopfloor, & inventory info. dead-solid perfect + "bulletproof & bugfree" (yes, one middleware was faster than the other, hence using 1 middleware for reads, & the other for writes)).

    Without that SLEEP api call though in loops that took returned data & populated grids & such (RAD controls)?

    The app WOULD HANG, & across every remote user across Terminal Server connections (shared line) over 25 miles away @ the companies' other remote location (factory part, whilst we were @ the administrative part most of the time).

    APK

    P.S.=> Between that Win32 API "SLEEP" api call, & using application idletime? You can do what was asked for, without threads... & just letting them go "high torque" ALL THE TIME, especially over TS shared sessions? A recipe for disaster... hung apps/hung shared sessions galore! apk

  98. Re:Established vs new programming languages for HP by timq · · Score: 1

    That is to say, if Firefox is busy loading a page, you can still click on the menus and get a response. Different aspects of the GUI are handled by different threads, ...

    Except that firefox isn't multithreaded.

  99. Re:Parallel programming is dead. No one uses it... by DragonWriter · · Score: 1

    We still have not solved more complex parallelization tasks. For example, GUI is still mostly single-threaded.

    This isn't because it's a hard parallelization task, but because GUIs are generally user-bound, not CPU-bound. Until you parallelize the software running on the unit between the chair and the keyboard, you don't need to worry too much about parallelizing the GUI.

  100. Re:Chapel? by DragonWriter · · Score: 1

    No widely spoken human natural language was "invented,"

    There are too many qualifiers in this statement. It should simply be "no natural language is invented", since that is true by definition of a natural language. An "invented" language like Esperanto, Klingon, Lojban, or Elvish is a "constructed language", not a "natural language".

  101. Re:Parallel programming is dead. No one uses it... by DragonWriter · · Score: 1

    I'd be skeptical of allowing smart compilers to do the programming for us, a compiler no matter how smart isn't the programmer.

    All a compiler does is write programming code, given an input which is also programming code (usually, in different languages.)

    So, if you don't want a compiler to do that for you, do you hand code all your object code, and then just run a linker?

  102. Re:Established vs new programming languages for HP by Anonymous Coward · · Score: 0

    Shared-memory architectures don't scale -- the biggest ones you can buy have about 64 CPUs.

    Try telling that to SGI :-) [see their Altix 4700 line]. But you are correct in that it is hard to scale up shared-memory machines, at least to the scale used by anything on the top500 list.

  103. Re:Parallel programming is dead. No one uses it... by Cyberax · · Score: 1

    Nope. It's because UI _is_ a hard task to parallelize.

    For example, a lot of programs are still single-threaded. Even the ones like Microsoft Word.

  104. Effective "parallelism" by using Unix's strengths by Nutria · · Score: 1

    While I pretty much agree with you, a few comments:

    But for suitably complex spreadsheets, I can imagine some accountants wishing there was some parallel processing done.

    If the spreadsheet is that complicated, then a spreadsheet is probably the wrong tool for the job. But that has to do with how you use spreadsheets, not whether they are parallelized.

    I curse at kmail for not being parallel. Sometimes when it runs spamassassin against incoming email, it can get annoyingly slow

    That stems from a PC mentality which makes the MUA do double duty as both MUA and MTA. Remember that "Unix" is not a "desktop" OS, but as a do-all OS, happy to simultaneously run "desktop" and "server" apps.

    Install fetchmail/gotmail, an MTA (I like postfix, but exim4 works just as well), spamassassin, an MDA (I like maildrop, but procmail is more popular) and an IMAP server (I like courier, but dovecot is also popular).

    Once you have it all integrated (it's not as complicated as you might think), with new mail being dropped into ~/Maildir (I have a cron job that runs fetchmail every 5 minutes), then config kmail to also look at the "localhost" IMAP server. (Also, there are scripts on the 'net to convert mbox files to Maildir format, or you could drag-drop using kmail.)

    Thus, in practice, you have "parallelism" even though the apps are single-threaded.

    (This scheme also allows you to switch MUAs or access your mail from a different machine. For example, I sometimes read my mail from my wife's Windows box, and one time when Debian Sid broke X, I was still able to read my email with Mutt from the console.)

    --
    "I don't know, therefore Aliens" Wafflebox1

  105. Re:Chapel? by Touvan · · Score: 1

    I think they come up with this stuff based on their narrow, one-sided public school education - English class, which teaches them the prescriptive rules of "Standardized English". That really was invented, and now it's taught to all students (in the US at least), and our culture dutifully squeezes every gaffe into that narrowly prescribed rule set.

    IMO, all public schools should be running linguistics programs in parallel with their English classes. It's important, because the way language works linguistically is more accurate and useful when you want to really understand something. The culture here could really benefit from more basic understandings of many things - including the way language really works and evolves.

  106. Re:Chapel? by Touvan · · Score: 1

    You know I looked it up, and apparently I had some basic facts wrong about the origin of "Standard English", but my point still stands I think. ;-P

    http://en.wikipedia.org/wiki/Standard_English

  107. Re:Parallel programming is dead. No one uses it... by DragonWriter · · Score: 1

    Nope. It's because UI _is_ a hard task to parallelize.

    No, again, it's because users aren't parallel, so there is very little to gain by parallelizing the UI.

    For example, a lot of programs are still single-threaded.

    Yes, and because users only do one thing at a time, that isn't a problem.

    Even the ones like Microsoft Word.

    And Microsoft Word won't benefit from a parallelized UI until you have users trying to click on menu buttons at the same time they are typing text.

  108. Re:Parallel programming is dead. No one uses it... by Cyberax · · Score: 1

    "And Microsoft Word won't benefit from a parallelized UI until you have users trying to click on menu buttons at the same time they are typing text."

    Or maybe, you know, until the user starts spell-checking in the background? Or maybe pagination of a large document?

  109. Re:Parallel programming is dead. No one uses it... by DragonWriter · · Score: 1

    Or maybe, you know, until user starts spell-checking in the background? Or maybe pagination of a large document?

    Those could benefit from parallelization of the engine, but not so much the UI, where a fairly traditional single-threaded event-driven UI would probably do nearly as well as a multithreaded one, even with a parallel backend.
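
    A rough sketch of that arrangement in Python, with all names (run_ui, spell_check, the event kinds) invented for illustration and no real GUI toolkit involved: one thread owns the event loop and all UI state, and background engine work only ever posts results back as events. It assumes the event list ends with a "quit" event; without one the loop would wait forever.

```python
import queue
import threading


def spell_check(text, dictionary):
    """Stand-in for slow engine work: return words not in the dictionary."""
    return [w for w in text.split() if w.lower() not in dictionary]


def run_ui(user_events, dictionary):
    """Single-threaded event loop; engine work happens on worker threads."""
    events = queue.Queue()
    for ev in user_events:
        events.put(ev)
    log = []
    pending = 0  # background jobs whose results have not arrived yet

    def background(text):
        # The worker never touches UI state; it only posts an event back.
        events.put(("spell_result", spell_check(text, dictionary)))

    done_typing = False
    while not (done_typing and pending == 0):
        kind, payload = events.get()
        if kind == "key":
            log.append(("typed", payload))
        elif kind == "check":
            pending += 1
            threading.Thread(target=background, args=(payload,)).start()
        elif kind == "spell_result":
            pending -= 1
            log.append(("misspelled", payload))
        elif kind == "quit":
            done_typing = True
    return log
```

    Keystrokes keep being handled while the check runs, and the log is only ever modified by the loop thread, so the UI side needs no locks. Where the "misspelled" entry lands relative to later keystrokes depends on timing.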

  110. Re:Parallel programming is dead. No one uses it... by wirelessbuzzers · · Score: 1

    Well, if an application is written to use threads, then it is both parallel and concurrent, in that the threads could be run on different cores/processors/whatever if they are available.

    Not really. Most threaded apps have at most one CPU-intensive thread. The threads are about concurrency, not parallelism... you just want to make sure that mostly-independent parts of the app don't block each other. Usually either everything is waiting for input or just the CPU-heavy thread is running. These apps benefit very little from multicore.

    For example, Firefox is threaded (poorly), but it isn't particularly parallel: it rarely or never brings the CPU over 100%, so it can't take much advantage of a dual-core CPU. It could theoretically be made parallel by pipelining or otherwise parallelizing the renderer, or making it run in parallel with the Javascript engine, but these are hard enough and gains small enough that they won't happen anytime soon.

    --
    I hereby place the above post in the public domain.

  111. Re:Parallel programming is dead. No one uses it... by bar-agent · · Score: 1

    What you are describing as parallel is simply more concurrent. You don't need a separate word. It just confuses people.

    --
    i'd hit it so hard, if you pulled me out you'd be the king of britain [bash.org]