Haskell 2010 Announced
paltemalte writes "Simon Marlow has posted an announcement of Haskell 2010, a new revision of the Haskell purely functional programming language. Good news for everyone interested in SMP and concurrency programming."
← Back to Stories (view on slashdot.org)
I admit that a function with no side effects can ease making things thread safe, but are recursive style functional programming languages really used that much in "SMP and concurrency programming" nowadays. I wrote some code in that category but I never envisioned a functional programming language would suit the job. Am I the only one ? ;-))
Thanks for your replies,
Everything I write is lies, read between the lines.
And all three of its users are overjoyed.
What the hell have these guys been doing for the past four years? Only now they are implementing DoAndIfThenElse/EmptyDataDeclarations
Is this the Slackware of the programming world?
"The difference between genius and stupidity is that genius has it's limits" - Albert Einstein
"The current committee is still discussing how to go about finding a new committee"
A fitting name for this Haskell programming language might have been "Python" hadn't it all ready been taken.
Some of my favourite people are from th US; Vonnegut, Chomsky, Bill Hicks.
So I don't think I'll look at the article until I actually need to program in Haskel....
Test your net with Netalyzr
I skimmed through the revisions, and I don't see how they relate to concurrency (caveat I'm a Haskell newbie). Maybe someone can enlighten me.
My UID is prime. Hah!
Haskell! Haskell! Every geek's favorite mental masturbation toy!
Please do not read this sig. Thank you.
Haskell can't compete with my implementation of the pure lambda calculus!
screw all that syntactic sugar, lambda expressions ftw!
The MAFIAA is a bunch of mindless jerks who will be the first up against the wall when the revolution comes
NoNPlusKPatterns... Seems that they're finally banned.
http://www.mail-archive.com/haskell@haskell.org/msg01261.html
I agree it's not always the best example, but there are four reasons why "qsort [1..10000]" is slow, and you can address three of them without writing something that's "100 times less readable than C".
The first problem is that naive quicksort on an already-sorted list/array is O(N^2). The fix is to grab an element out of the middle and use that as your pivot. That alone will make qsort far faster. (I tried it.*) The resulting code is, of course, a bit messier.
Secondly, the qsort function has no type signature, so it will default to the most polymorphic implementation, which is something like qsort :: Num a -> [a] -> [a]. Typing "qsort [1..10000] probably (though I'm not sure, it's a ghci-specific thing) defaults to using Integer, which is less efficient than a machine Int. If you constrain the type signature the code will be faster.
Thirdly, you're using ghci. Code compiled with "ghc -O2" is far faster. (Be sure not forget that -O2, it make the code about 10x faster.)
The fourth reason is that C is doing in-place array swaps. (You can do the same thing in Haskell using the ST monad if you really want to. I don't think the result will be "100 times less readable", but it will be fairly ugly compared to the pure version. There are a couple such implementation here.) Whether doing quicksort with lists is good enough really depends on what you're doing.
* Quicksort modified with a swap, then mangled by slashdot:
Closures are a mechanism that allows you to efficiently write a function that can produce infinitely many different functions as its output. Functional programming without closures would either be a big pain, or would involve compiling new functions at runtime all the time. There might even be things that you provably can't do without closures; in the lambda calculus, beta-substitution can substitute free variables inside a lambda abstraction with values from outside its scope, so it comes down to whether certain restrictions on substitution impair Turing completeness.
And once you have closures, you need garbage collection.
Are you adequate?
How many languages does it take to tell Mrs. Cleaver how lovely she looks today?
The revision to which I look forward is the hack that causes it to make very impolite comments to Mrs. Cleaver.
I see even classic Slashdot is now pretty much unusable on dial up anymore.
A purely functional language can model a stateful computation by encoding the program as a sequence of state transformers. Basically, the state of a computation is explicitly represented in the program by actual values that you pass around as arguments to functions that return a new states. By chaining state transformer functions you can write an equivalent to any imperative program.
Thus, in principle, could encapsulate IO into a purely functional language simply by requiring that all IO functions take an abstract "state of the world" argument and return an altered "state of the world" value. These "state of the world" values don't have any functions that allow you to introspect into them; all they serve to do is to determine the order in which IO actions would be performed.
So for example, your purely functional language could have a print(str, sow) function would take as its arguments (a) a string to be printed, and (b) the state of the world before the string has been printed, and return (c) the state of the world after the string has been printed. If you want to print "Hello " before "world!", then you have to write code like this ("sow" means "state of the world," and is passed in as the argument to the main() function of the program):
In this pseudocode, "Hello " is printed before "world!" because the state of the world that results from the inner print() function call is passed in as an argument to the outer one.
The problem now is that unless you do something to forbid it, people can write impossible programs by using nonsensical state chains; e.g., by trying to use the same "state of the world" value twice as if you could reset the world to the state before you printed "Hello " and reperform this destructive action:
In this nonsense program, you're supposed to print both "boo!" and "Hello " as the very first thing, which is impossible. The main() function is also supposed to return the state after "boo!" was printed, which means that the caller of main() is allowed to print stuff before "world!". Yeah, it makes little sense.
Basically, you can think of monads as an abstraction that allows us to implement this kind of purely functional state-passing program, but encapsulating the state-of-the-world values so that the callers can't touch them, and thus prevent them from writing nonsensical programs. So strictly speaking, they allow the language to provide side-effecting operations but isolate them from the purely functional parts of the language. Any function that does IO must use the IO monad in its type, and purely functional code that can't extract any values or state from the IO monad. To perform any computation on values that depend from things you read from IO, for example, you can't pull out the state-dependent values out of the monadic context and pass them as arguments to your functions; you must, in effect, pass your functions to the IO monad so that its internal implementation can apply them to the quarantined, stateful values.
Are you adequate?
There are several reasons. As abigor stated, functional programming is not well understood by the majority of programmers. If you suppose that there is one Haskell programmer for every 1000 Java programmers, then that one Haskell programmer has to be one thousand times more productive in order to get noticed by the larger programming community. (I don't know if the distribution is that skewed, but I think you get the idea.) There are small companies that have been very successful at using FP, but they're easy to miss if you don't look in the right places.
Another reason is that the fastest functional languages are still slower than C in general. I don't think the difference is as much as you say, Haskell and Ocaml both have very good code generators. Do check out the "programming language shootout" results. (Be sure to use -O2 with ghc, though, or you really will see a 10x difference.) However, there are many programmers for whom if you give them a choice between one language and another, they will always choose the faster language regardless of other considerations. A lot of supercomputing people still use Fortran, for example.
A third reason is that software firms are very risk averse. If you have a problem that you know a team of average Java, C++, or C# programmers can solve in N months, versus a smaller team of more expensive functional programmers who may or may not be able to solve the problem in much less time in an unknown language that no one else seems to be willing to bet the farm on, then a lot of firms will see FP as an unacceptable risk.
Fourth, consider the sales pitch for Haskell: "Hey, check out this cool new language! It's a bit slower than C, and you can't change the value of any of the variables. It was designed by committee. Half of the people who use it got their PhD by hacking a type system extension onto ghc." The primary benefits of Haskell are hard to describe to people who haven't used it, namely that programs tend to have fewer bugs (thanks to the strict type checker), and it often takes much less code to solve a given problem. The drawbacks, however, are rather obvious up-front: how many programmers do you know who would welcome the idea of "everything is const"? There aren't really any Haskell programs that you couldn't write in C, or Java, or C++, or C#, or Python or any of dozens of other languages. The difference is, how much code are you going to write, and how much work is it going to take to make the program reliable?
I would consider Lua a functional language (and I think it's designers went out of their way to make it so), though it isn't exclusively functional; it is possible (and typical) to write imperative, stateful code in Lua as well. I haven't used it much, though, so it may be that there is some drawback to using a functional style that I haven't considered.
Let's see if I can explain this simply.
The Haskell language, like any other language, needed constructs like "read" and "write", but to implement them as simple functions would have broken the underlying assumptions of purity and lazy evaluation.
Haskell happened to have monads. A monad is essentially a typeclass for containers, that allow you to do certain things to combine containers of the same type, without having to worry about what kind of container it is. Most (all?) of the containers in the standard library are instances of Monad.
The Haskell language designers came up with (or perhaps borrowed) an idea. They would create a new container type, called IO, and make it an instance of Monad. However, unlike other containers, it would not have any accessor functions. You can pass around an object around of type IO in pure code all you want, but you can't ever examine the contents of the IO container from within "pure" code. The only thing you can do with it is combine it with other IO objects. Combining two IO objects is equivalent to evaluating the file operation or what have you inside one IO object and passing it's result to and executing whatever is inside the second IO object. The actions within an IO object, however, are free to invoke pure code if they like.
Every haskell program has a main() function, which is an IO action. This allows you do do any necessary file IO your program needs to do, and it can also call out to pure functions. Pure function cannot invoke IO actions. Most Haskell programmers try to keep the IO actions as simple as possible and rely on pure code for the bulk of the program.
As a concrete example, I wrote a ray tracer, which parsed a text file and generated an image. As I was writing it, I got to the part where I needed to write the file parser. I thought "oh, no, this whole thing has to execute within the IO monad and it'll be a big mess". However, it was not so. After scratching my head a bit, I ended up writing a pure function that takes a simple text string and converts it to my internal representation of a scene, ready to be ray traced. Within main (within the IO monad), I would read the text file in with a lazy function "hGetContents", which returns a string which is the contents of the file. I would pass that string to the parser, and then trace a grid of rays (one per pixel) against the parsed scene. The list of pixels with their calculated color values was returned to the IO monad, where I used OpenGL to plot them to the screen.
The interesting bit about this is that hGetContents is lazy. In a strict (i.e. non-lazy) implementation, the whole string would be read at once. This is inefficient, and may cause problems for text files that won't fit in memory. Due to laziness, however, the string is passed into the parser without being fully evaluated. As the parser needs more data, the run-time system will cause hGetContents to read another block. So, here we have an example of a pure function that's indirectly triggering IO, and it's doing it all without violating the constraints of the type system.
I think that experienced haskellers often forget to explain that there is a portion of the program that is not strictly functional. The thing is that the programmer is not given access to it. Instead, the programmer is asked to pass around descriptions of the I/O actions to be taken. A monad is a data type that (amongst a vast number of other things) can be used to structure these descriptions so that we can get the order of execution right. (notice that I didn't say 'evaluation')
The next part is a bit sloppy because monads turn out to be even more abstract than described, but it suffices to explain the concepts that give us the ability to be pure and still interact with the outside world.
Every Haskell program is an instance of a function that returns an IO Monad. "Inside" that monad (for the moment, think of it as a box plus a little bit of extra data) is a description of the I/O action to be performed and a new function that takes the result of that I/O (possibly discarding it) and produces another monad. Only the function inside the monad is allowed to refer to the result of performing the execution of that monad, but it is also able to refer to any functions outside of the monad. (Like lexical scoping.)
There's always a impure portion to a program that the programmer never gets to see. It's job is to evaluate just enough of a function to get ahold of an I/O monad, read the description inside that monad, perform the action and then repeat the whole process again by passing the result into the function if found inside the monad. This division of duties is enforced by only allowing the programmer to use the functions with stuff a description and function into a monad, but not the ones to get it out. Only the impure part of the function can
All this so far is interesting, but it seems like it would take an awful lot of discipline just for the sake of purity. However, what really make monads snazzy is that there are some great tricks with syntactic sugar that can help the programmer design these descriptions in much the same way he would write an imperative program. This is Haskell's 'do' syntax. The 'do' syntax doesn't make a purely function Haskell program imperative, but it sure makes it look a lot like it is.
Still, monads are nothing more than a data type with a couple of particular kinds of functions defined on it. In the case of I/O, those functions stuff things into the monad, combine monads and get information out of the monad. If you use monads for other things, it might be worthwhile to think of those functions as serving other purposes. Haskell's monad type class simply abstracts the features of all these so that algorithms used on one can often be used on all the others... even if it does obscure the original interpretation of what's going on.
"My religion is to live --and die-- without regret." -- Milarepa
The definition of purely functional is a bit more complicated than you think it is, but Haskell is pure. A Monad is just a mathematical structure and is no more a part of the language than groups, rings or fields are - you can still put them in your program though. The IO monad is a bit special, because it's depending on the real world, but it's still a monad. The programming verification you speak of is much easier than the corresponding things you can do in C, since you know that there will not be any null pointer dereferences and similar horror stories in your Haskell programs you can focus on verifying the interesting things.
Haskell does have side effects, just like any other useful programming language. However, you can't put side effects just anywhere in a Haskell program; you have to explicitly specify which functions have side effects.
This is done using the type system so that the return values of all functions that can have side effects must be IO x (ie. IO<x> in C++/Java notation). This way, you (and the compiler) can be sure that any function that DOESN'T have the type IO x is 100% side effect free, allowing easy parallelization, many types of optimizations, etc. In other words, all side effects are put in a separate "side effect bin".
Another big thing to understand is that Haskell makes it impossible (well, not really, but strongly discouraged) to return a value from IO back to the pure part of the program. Any computation that may depend on the result of some side effect is considered to have side effects of its own; removing a value from the "side effect bin" is a side effect. Essentially, side effect code can call both side effect code and pure code, but pure code can only call other pure code.
Since every useful program has at least some side effects (reading input, returning output), every Haskell program has a main function which has the type IO () (ie. has side effects, doesn't return anything). The main function can then call the rest of the program, just like in other programming languages.
In order to keep this type of programming from being a total pain in the ass to program with, Haskell uses monads (which are an unrelated concept, they are used for lots of other things as well) to make it easy to compose smaller IO functions into larger, more higher-level IO functions.
things can be passed from one function to another without naming them.
Isn't that a rather point-less feature? ;-)
Can it; I think not. It's incredibly cool to write qsort in just 2 lines of code, but it's incredibly slow.
Actually hGetContents cheats by using unsafeInterleaveIO. It's a cool hack, but lazy IO is an inherently risk business. For instance, if your program later wrote back changes to your scene file, the result is undefined. There's no guarantee of when the program is done reading the file. If you keep this in mind it's no problem, but if all IO worked this way you'd go insane. (Also consider the problem of error handling if there's a disk error.)