Time to Get Good At Functional Programming?
prone2tech writes "From an article at Dr. Dobb's: Chipmakers have essentially said that the job of enforcing Moore's Law is now a software problem. They will concentrate on putting more and more cores on a die, and it's up to the software industry to recraft software to take advantage of the parallel-processing capabilities of the new chips. As is argued in this article, this means becoming proficient in parallel functional programming. The bad news? Getting good at functional programming is hard, harder than moving from iterative Pascal or Basic or C coding to object-oriented development. It's an exaggeration but a useful one: When you move to FP, all your algorithms break.'"
You mean oo isn't the only option?
Modding me -1 troll doesn't make me wrong.
When you move to FP, all your algorithms break
If moving to a functional programming language breaks your algorithms, then you are somehow doing it wrong. That line doesn't even make sense to me. Algorithms are mathematical constructs that have nothing to do with programming paradigm. Assuming the language is Turing complete, how is that even possible?
I don't see why an algorithm would break just because you're changing language type, the whole point of an algorithm is that it's programming language independent.
Question is, how realistic is that?
Amdahl's Law also tells us tells us that the amount that parallelization can speed something up is ultimately limited by the parts that can't be done in parallel.
(have I (feeling ((become popular Scheme) again)))
Colin Dean Go a year without DRM
It's been said in the comments on slashdot many times. Learning functional programming techniques will improve your programming skills. There are many good functional languages out there, and many have imperative features for ease of transition. No functional will not solve all of your problems, but it will give you that most valuable of all lessons, how to think about a problem _differently_.
You don't need an excuse, start today.
Absolute statements are never true
This reminds me about the /. article "Twenty Years of Dijkstra's Cruelty", just a few days ago.
Problem boils down to fact that programming is in fact a very advanced calculus. And writing a program is 'deriving' it. As in reaching a correct formula with a proof that it's correct. That's how software should be written anyways. And functional programming will only make it *simpler*, not harder.
#
#\ @ ? Colonize Mars
#
So a quick question before I go and master FP...does the compiler automatically compile the code that can be done in parallel in the proper "way", or do I have to specify something?
Also, if I rewrote an app written in an imperative language to a FP one like Haskell, would I see a that much of a difference on a multi-core processor?
How is maintaining the rate of increase in the number of transistors that can be economically placed on an integrated circuit a software problem?
The biggest problem with functional languages tends to be their type systems (I'm looking at you, Haskell). A functional language with a nice type system, like Erlang, is easy to pick up. And the example I picked totally at random, Erlang, also happens to have CSP primitives in the language, which makes parallel programming trivial. I've written code in it and then deployed it on a 64-processor machine and watched it nicely distribute my code over all 64 processors. If you program in a CSP style (which is easy) then your code will exhibit 1000-way parallelism or more and so will trivially take advantage of up to this many processes.
And, actually, object orientation is a good option too. Alan Kay, who defined coined term, defined it as 'simple computers [objects] communicating via message passing' - sounds a lot like CSP, no? The main difference is that OO is usually implemented with synchronous message passing, but you can implement it with asynchronous messaging (actor model) then you have something almost identical to CSP. You can also add this implicitly with futures. I've done this in Objective-C for Etoile. Just send an object an -inNewThread message and any subsequent message you send to it is passed via a lockless ring buffer to the other thread and executed. We use it in our music jukebox app, for example, to run the decode in a separate thread from the UI. Implementing it in the language, rather than the library, means you can do it more efficiently, so this by no means replaces Actalk or Erlang in the general case, but modern processors are fast serial processors so it makes sense to program much larger chunks of serial code on these systems than Erlang or Actalk encourage.
I am TheRaven on Soylent News
Hmm, strange, I thought I had logged on. You gotta love web apps. :]
Ezekiel 23:20
Lisp! NO!!!!!!!!!!!!!!!!
Rolls over and dies...
(added to make filter happy)
This space is not for rent.
I've recently gotten into FP. I started with Erlang and then branched into ML and Haskell. In case you're interested, here are the best books I've encountered for each language:
Programming Erlang
Programming Haskell
ML for the Working Programmer
Also, I'd definitely recommend starting with Erlang, because the Programming Erlang book made for a very easy introduction to functional programming.
http://www.ni.com/multicore/
A. Many programmers start writing or re-writing their code in functional programming languages.
or
B. Programmers continue writing to their platform of choice, e.g. .NET, Java, etc., and the guys writing the virtual machines do the heavy-lifting, making the VM execute more efficiently with multi-cores?
I'll go with B.
Apple is already proving this. Mac OS X Snow Leopard will have a lot of this built-in. Read about "Grand Central."
Ironically, the word ironically is often used incorrectly.
Look at the table of contents of this BYTE magazine from 1985. In a nutshell it said the same thing as this article: Functional languages are the great hope for solving the parallel programming problem. Only then the languages were different: Hope, Linda, and Prolog were among them.
My response back then was to get excited about FP. My response now is: Where is the proof? Can anyone name a single instance where a functional paradigm has yielded the best measured performance on a parallel computing problem? In other words, take the best functional programmers in the world, and pair them up with the best tools in existence. Can they actually create something superior, on any problem running on any hardware? This is a very low bar, but until it's demonstrated FP will be confined mostly to the lab.
IMHO the path forward is to treat parallel programming like just another optimization. As we know, the vast majority of your code doesn't need to run fast, and you get most of the performance benefit by optimizing small bits of code that really matter. I suspect the same thing will happen with parallel programming: In a given application only a few areas will benefit much from parallelism, and these tasks will probably be very similar across applications. Graphics rendering, large matrix math, video encoding/decoding, and speech recognition would be examples. People will treat these as special cases, and either develop special-purpose hardware (e.g., GPUs), or libraries that encapsulate the nitty-gritty details. The interesting question to me is what is the best runtime model to support this.
As an example of the learning curve, I wanted to learn a little OCaml. I played around with this insertion sort example. I used it to sort a list of 10,000 integers, and it took 10 seconds, versus <1 second in C with linked lists. Not too horrible. But changing it to 100,000 integers made it die with a stack overflow, so I'm guessing that its memory use goes like n^2. However, it's not at all obvious to me from looking at the code that this would be the case. I think if I wanted to do a lot of OCaml programming I'd have to develop "FP Eye for the Straight Guy." Probably if you wanted to make it perform better on big arrays you'd want to make it tail-recursive, but it's not totally obvious to me from the code that it's *not* tail-recursive; although the recursive call isn't the very last line of code in the function, it is the very last thing in its clause...?
I know of at least one well known OSS project in Haskell, written by a very smart guy, that is really struggling with performance issues. I wonder whether bad performance is to FP as null-pointer bugs are to C. Sure, a sufficiently skilled programmer should theoretically never write code that will dereference a null pointer, but nevertheless my Ubuntu system needs a dozen security patches every month, many of which are due to null pointers, buffer overflows, etc.
Find free books.
While pure functional code isn't allowed to manipulate mutable, shared state, functional languages often provide some mechanism to mix "pure" and "impure" (stateful, imperative code).
In the haskell world, there is the IO monad, which is sort of a sandbox where anything is allowed. Functions within the IO monad (often called "IO actions") are allowed to invoke other IO actions or call pure code, but the reverse is not true; pure code cannot invoke an IO action. Also, due to laziness, pure functions that were passed an unevaluated "thunk" as an argument might trigger deferred IO, but this is (usually) transparent to the programmer.
So far, this doesn't sound any better than a pure imperative language, but there is also an STM monad (for software transactional memory) which is like pure code except that you're allowed to access shared mutable data through a restricted API. STM is based on the idea that if two processes race and manipulate the same shared data structures at the same time, the conflict can be detected by the run time system, which can stop and replay the transaction one after the other.
The reason STM transactions can be safely replayed by the run-time system is that the language guarantees that the STM transaction doesn't have any hidden state somewhere, that might cause problems if the transaction were replayed. This is not a guarantee you can make in C, C++, Java, or any other popular language that I am aware of.
Note 1: It is possible for STM transactions to livelock if they continually conflict and are replayed, so you can still shoot yourself in the foot. However, it does make avoiding certain other problems much easier.
Note 2: I'm not really a haskell guru, so everything above should be taken with a grain of salt. Real World Haskell has a chapter on STM, which is the basis of my current understanding (I haven't yet had cause to use STM in any program I've written.)
Pure functional programming removes all side effects. This make memory optimization (critical to efficient multiprocessing) much easier. It also makes garbage collection easier - but that is pretty much canceled by an increase in garbage.
But beyond functional programming is thermodynamic computing. This starts with functional, but requires all operations to be reversible. Ideally, the total electrons are conserved - you can never clear a bit - just exchange bits (and of course more complex operations like add, mul, etc - but all reversible and charge conserving). Of course real hardware will still need to make up for losses, but power consumption and heat go way down.
The fascinating thing is that thermodynamic programming requires a pool of known 0 bits and known 1 bits. As the algorithm progresses, you can't just throw away results you aren't interested in - you collect the unwanted results in an entropy pool. Eventually, you run out of known bits, and need to clear some entropy bits in order to continue. This takes lots more power (like erasing a flash block). The analogy to real world entropy is striking.
Parallel algorithms are fundamentally different from sequential ones. Take sorting. No multi-threading is going to help you if you keep implementing quicksort. While many problems are inherently parallel and it is easy to undo their serialization, several others will turn into bottlenecks. I am almost done with my Ph.D. and still I haven't received a proper education in parallel algorithms. It'll take a whole new generation of CS teachers to make the grand paradigm shift.
To do list for Windows
Many functional languages do not allow the program to manipulate shared state, which removes that whole class of problems; information sharing is limited to letting the run time system fork threads and copy function arguments and return values back and forth.
However, if you really do need shared mutable state (and there are plenty of algorithms that do), in Haskell there's software transactional memory, which provides a restricted API for manipulating shared mutable state with the STM monad. Since the code within an STM transaction is not allowed to cause side effects or access any shared mutable state outside of STM, the runtime system is able to stop and replay transactions whenever it detects conflicts. This is one area where FP actually can make things easier, by enabling the same sorts of operations you might do in an imperative multithreaded program, but doing it in a much safer way.
And, if you don't need shared mutable state, sometimes parallelism can be achieved just by replacing "map" with "parMap" in the right places and recompiling with the appropriate options. It doesn't get much easier than that.
Auto-parallelization of functional programs has been proposed for decades now, and every attempt has fallen on its face as the overhead has killed any gains. Current parallel FP research isn't even putting that much effort into auto-parallelization, because most PLs researchers consider it a dead end---taking a random FP and evaluating all its thunks in parallel as futures, or some similar mechanism, is not going to solve the problem.
Instead, most of the current research is in programmer-level primitives for designing and specifying inherently parallel algorithms. There is some of this in both the FP and non-FP communities.
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
First, the scope for mono-processors is now strictly limited and we _will_not_ see x2/18 months again, there may be x10 to x100 possible within this basic technology but thats just a few Moor's law cycles; second, the (commercial) problems, as described elsewhere, involve the solution of partial differtial (Heat, Navier Stokes, Elasticity) Equations or Stochastic Simulations --- all of which are inherently clusterable, but not gridable).
It is the cache coherency and memory bandwidth problems with existing architectures that are the problem. We need better low latency data transfer and significant improvement in auto-parallelism technology in compilers.
It should be clear that there has been very little serious investment in basic compiler technology and that is now needed. Academics have realised this but it takes time. The bandwidth issues are solvable else-when with more transistors.
Finally, we have a variety of programming paradigms OO, Functional & procedural and more each of which has a problem niche.
One thing we will certainly have to get away fom is the idea that 'legacy' code can carelessly be re-written in the flavor of month interpreted language eg Java, C#, Perl, Python or Ruby. You can write 95% of your code in a programmet friendly language. But the critical sections need to be in C, FORTRAN or Assembler and need to be very carefully optimized. That can give you x100 on the same architecture.
>>When you move to FP, all your algorithms break
>If moving to a functional programming language
>breaks your algorithms, then you are somehow
>doing it wrong. That line doesn't even make sense
>to me. Algorithms are mathematical constructs
>that have nothing to do with programming
>paradigm. Assuming the language is Turing
>complete, how is that even possible?
You are confused about the definition of an algorithm, and the significance of Turing completeness.
First of all, an algorithm is a *way* of doing things with an associated complexity specification (a mathematical description of how long it will take to run often denoted like O(n)).
Two turing equivalent machines don't necessarily support the same algorithms, although they will always have *equivalent* algorithms that get the same job done. HOWEVER, those algorithms don't necessarily have the same complexity. For instance, on turing machine A a sort might be done in O(n^2) while on turing machine B a sort can only be done in O(n^3).
To be functional means to be stateless. If you don't have state, then all sorts of algorithms become much more expensive. Notably, it's impossible to do a quicksort in a functional language, although other less efficient sorts may be done. Some people respond to that by saying that you can just buy a faster computer if you want to run functional algorithms; however, anyone with a decent computer science education knows that this can't solve differences in assymtotic complexity.
NOTE: quicksort (which cannot be done functionally) does not have better worst case (big O notation) complexity than mergesort (with can be done functionally), but it does have best average case and takes advantage of the underlying machine implementation much better. In some ways it is a bad example, but most people are familiar with sorting, whereas few people are familiar with dynamic algorithms.
The reason that functional programming languages exists goes back to Church and Turing. Church invented lambda calculus, and Turing invented Turing machines. Both are computationally equivalent in their power.
Turing machines have state, and are essentially a description of a hypothetical machine. Lambda calculus, is well, a calculus. It is functional in nature and has no state.
Not surprisingly, real world computers look more like turing machines than they do Lambda calculus evaluating machines. Also, virtually all programming languages are built around state manipulation, since that's what the underlying hardware has to do.
The idea of a functional programming language is to emulate the lambda calculus on a rough approximation of a Turing machine. Technically it's possible for any Turing equivalent machine to emulate any other. However, since the two machines are so different, this makes things dog slow. Again, faster computers don't solve this problem because there is an assymtotic difference in complexity, not a constant factor difference.
You seem to have some serious misunderstandings here.
Uh, no. By removing side effects functional programming removes the need to copy anything. If I'm trying to evaluate f(X) + g(X) for some complicated X, f, and g by evaluating f(X) and g(X) in parallel and adding the results, I don't need two copies of X because I know that neither f nor g will modify it. That's the whole point.
It only seems counter intuitive if you've swallowed the procedural programming paradigm and adopted it as your own to the point where you've forgotten how counter intuitive "X = X + 1" seemed at first.
And saying it's non-deterministic is just nuts. Sure, you could add non-deterministic semantics to any language, but there's nothing inherently non-deterministic about functional programming. In fact, I think you'd typically have to work a lot harder to make a functional language non-deterministic.
FP has nothing to do with threads, apart from the fact that functional programs could be executed by a large number of threads in parallel (or independent cores, or...?) without changing the outcome. And what exactly is the mess we're in? I can't think of another industry that has succeeded so spectacularly in such a short time.
And so on...did I just feed a troll?
--MarkusQ
Huh? Quicksort is pretty easy to parallelize, due to its divide and conquer nature: it splits its list into sublists and recurses on those sublists.
Python has some features inspired by FP languages including Haskell, but is not anything like real functional programming. Haskell is far more powerful and serious, but also a heck of a lot more difficult to use. Python has a "practicality beats purity" mantra; you could think of Haskell as "purity strikes back".
Stuff Haskell gets you:
Serious native-code compiler whose output can beat compiled C code for some programs (and does darn well on average, see the Alioth shootout)
Ability to use multi-core parallelism, with a library module that treats shared memory as a transactional database, allowing use of shared data between threads while getting rid of all the lock management headaches of languages like Java. This can work because Haskell's functional purity guarantees that the threads won't clobber each other except under circumstances precisely controlled by the library.
Data parallelism allowing computations of list comprehensions to automatically be done in parallel on multiple CPU's
Rigorous type system (nothing like the broken ones like in C or Java that you might be used to) lets you express complex invariants in your datatypes so that errors can be caught by the compiler. This greatly decreases the amount of runtime debugging required.
I could go on, but you get the idea.
Good tutorial: http://learnyouahaskell.com/
More detailed book (full text online): http://book.realworldhaskell.org/
Haskell has a very steep learning curve compared with other languages (or "unlearning curve", as some put it, since you have to forget everything you knew), but learning it (a still ongoing process for me) is one of the most interesting and mind-expanding things I've ever done as a programmer.
He's probably talking about Markov chain Monte Carlo, not the simple Monte Carlo where you just draw random numbers. You can run a bunch of Markov chains in a parallel manner with no communication, so MCMC is perfectly parallelizable in that sense. However, that often doesn't help you, since the main problem is usually a slowly mixing chain. Running N poorly-mixed chains isn't much better than running one. What you'd really like is to speed up a single chain. There are parallel MCMC algorithms out there that try to swap members from parallel chains to speed up each individual chain, but they're not always worth the effort.
The difference between 1985 and today is the end of the free lunch in clock rates. 1985 was roughly the middle of the last great clock rate stall - during the shift from minicomputers to micros, Moore's Law expressed itself not as speedup, but as decrease in cost per bit/gate.
Then we entered the great die-shrink and clock rate increase era, where for about 20 years processors really did get twice as fast every 12-18 months. Why expend the effort to bury a problem in parallel hardware when you can see faster serial hardware coming over the hill?
Clock rates have stalled again, and we're reaching physical limits for our current fabrication methods and physical chip designs. We're seeing renewed interest in functional programming because it looks like a way to make use of parallel hardware safely compared to imperative coding. Traditional coding methods are easier to understand, and are probably more efficient when they work, but...
how fast do you want the wrong answer?
To a Lisp hacker, XML is S-expressions in drag.
Let's say you have a few thousand (name, address) pairs and you want to be able to quickly look up a name to get the corresponding address, to add new names, etc. In imperative programming you'd probably use one of the mainstay data structures of CS 101, the good old hash table. To add a new name, you hash it and go and poke that address in the table to record the entry.
Well remember that stuff about values in functional programming being immutable? Right, no hash tables in functional programming. You'd instead use something like an AVL tree or red-black tree, that let you create a completely new structure that shares most of its content with the old one, except that the new one has this extra node. Of course FP language libraries come with modules for making those structures, and in practice you can use them at the API level sort like how you used to use hash tables, but they are completely different underneath, and if you want to program them yourself you are going to have to learn a lot of very basic techniques from scratch all over again. Chris Okasaki's book "Purely Functional Data Structures" is a good place to learn about this stuff in detail.
Even more basic: the good old "for" loop, which updates an index variable each time through. Whoops! You can't update the index in a functional language, so there's no "for" loop. You instead use recursion, or a "higher order function" (function that operates on other functions). So instead of
for (i = 0; i < n; i++) xs[i] = f(ys[i])
You'd write something like
ys = map f xs
("map" takes a function f and a list of values xs, applies the function to each item in the list, and gives you back a new list). There is also a "list comprehension" syntax that you might know from Python:
ys = [f(x) | x <- xs]
but for complicated functions you end up having to use higher order functions and recursion explicitly. You really have to think a lot harder to program 20 lines of Haskell than 20 lines of C. But those 20 lines can do an order of magnitude more.
(Aside:) In case you were wondering, yes, you can implement traditional hash tables and other mutable structures in functional languages, and there are times when it's necessary, but it's comparatively a pain in the ass and you give up some of the advantages that had you programming functionally in the first place. Here is an article about someone's experiences switching from a mutable structure to a functional structure in a large program, and the headaches the functional structure solved:
http://www.cs.tufts.edu/~nr/pubs/zipcfg-abstract.html
Not in a functional language, particularly a pure one. This just ends up looking like map/fold or cata and ana morphism calls.
All of the reasonable ways of throwing transistors at getting faster straight-line code execution have already happened. Hell, even the unreasonable ones have been implemented, like fast 64-bit division units.
You, and the chipmakers, have apparently become stale.
There have been claims like this made throughout history. The patent office was closed because of this, bill gates once declared a maximum filesize we'd ever need, and the list goes on and on.
If today's major chipmakers are too lazy and uncreative to come up with new ideas, then academics and entrepreneurs will come and eat their lunch.
VLC FOR MAC IS DYING! IF YOU DEVELOP, PLEASE SAVE IT!!
IMO another reason functional languages haven't taken off is because most lack decent (and well documented) _standard_ libraries for doing many common things, and even if there are decent libs there's typically no way to know which is the standard one to use.
;) ). So similarly who wants to deal with 10 different database abstraction modules, or 10 different networking libraries, or 10 different signal handling libs? Or worse, having to write and debug from scratch your own lib after spending time to find out the libs out there are crap.
Between picking a programming language that is powerful (expressive/concise) for all the code you have to write (insert random FP language here) and picking a programming language that is powerful for all the code you don't have to write (perl, python etc), I'd pick the latter in most cases.
Even if an FP language is 2x more concise than perl for a given task (AFAIK it isn't), without lots of decent libs, you usually have to write a lot more code.
The more code you have to write, the more code someone else later on has to read and understand. And worse the more code that has to be documented and maintained.
Standard libraries are important, even if they are only defacto standards because if you use a standard library, even if it's buggy, when an experienced programmer looks at your code, they know what the lib does (and hopefully its bugs), so they only need to read your code, they don't have spend time looking at the lib.
Whereas if you were writing your own custom libraries, or using one of 100 possible libraries (with no defacto standard) out on the internet, the person taking over or helping out will have to spend extra time trying to debug/understand it.
Some of the FP languages are catching up in terms of standard libraries, but for many you either have to write your own crap or have to waste lots of time finding and evaluating libraries to see if they are worth using.
Nobody in their right mind wants to deal with 10 different print commands (ok maybe the php people are different
I'd rather be able to get on with writing the more "interesting" bits ASAP.
Apologies (honest) for the sarcasm, but do really think that if the CPU vendors had any useful room left to increase sequential processing performance that they wouldn't use it? Are 3D layouts out of the research labs yet? Are production fabs for 3D chips feasible?
I.e. what basis is there to think CPU designers have any choice (for the mid-term) but to spend the transistor income from Moore's Law on parallel processing?
I use Friend/Foe + mod-point modifiers as a karma/reputation system.
With GBs and GBs of memory, there is no reason that common include files shouldn't get stored in cache. The writes are also miniscule, and contention there would also be a design error. Merging everything in one file is another thing, of course, but that mainly shows how expensive parsing (repeatedly, of include files) is, not the I/O itself.
When people say that they want to learn "functional programming", they usually mean some functional programming language. Common functional programming languages (OCAML, SML, Haskell, Lisp, Scheme, etc.) are duds when it comes to performance: in principle, they encourage parallel-friendly programming, in practice, they get many basic language design and performance issues wrong.
It's also silly to rewrite large amounts of code when usually only a small percentage of the code is performance critical. If you need to speed up C code to run fast on a multicore machine, optimize the inner loops. All the "functional programming" you need for that is available right within C in the form of OpenMP. You'll understand OpenMP a little better if you have used a "real" functional programming language, but the concepts aren't rocket science and you can do just fine as a C programmer.
Bull. An industry like software doesn't depend on talent alone. If all the engineering disciplines relied on talent, we'd be in the stone ages.
The real issue is that there are shitloads of people in the software business who either have grown up knowing nothing at all but the imperative paradigm or that know about FP but think it isn't really useful or think is just the stuff os masturbatory braniacs.
As to any claims somebody would make of being a supercoder, I would be highly skeptical. The software industry is plagued be delays and bugs. Security bugs in Linux and Windows have become the norm. People have grown up thinking it's as natural for a computer to get a "virus" as it is for you to catch a cold.
Many a good people have worked hard on theories and products to produce safer code, but they are largely ignored, except in industries where it is critical (military, aviation, etc.). The average developer is an ignoramus.
Main difference between the BSD license and the GPL license: one is from California and the other is from Massachusetts