Slashdot Mirror


Purely Functional Data Structures

andrew cooke writes "A while ago I read the comments following a Slashdot book review. Someone had posted a request for books that covered a wider range of languages than Java, C, Python, etc. Well, I thought, why not review Okasaki's Purely Functional Data Structures? It's a classic from the underworld of functional programming - recognised as the standard reference, yet clear enough to work as an introduction to the subject for anyone with a basic functional programming background. Of course, some readers won't know what functional programming is, or what is special about pure data structures. So I hope that this review can also serve as something of an introduction to the languages that I (a software engineer paid to work with Java, C, Python, etc) choose to use in my spare time, just for the joy of coding." Read on for the rest; even if you're not planning to give up C or Perl, there are links here worth exploring. Purely Functional Data Structures author Chris Okasaki pages 220 publisher Cambridge University Press rating 8/10 reviewer Andrew Cooke ISBN 0521663504 summary Functional programming for grown-ups.

In Okasaki's introduction he says that the "[...] benefits of functional languages are well known, but still the vast majority of programs are written in imperative languages such as C. This apparent contradiction is easily explained by the fact that functional languages have historically been slower than their more traditional cousins, but this gap is narrowing."

Indeed, OCaml has a reputation for being "as fast as C," yet it contains automatic memory management and supports object-oriented, as well as functional, programming. It's also probably the most widely used functional language outside academia (except perhaps Lisp/Scheme).

I mention OCaml not just because it's fast, free and popular, but because Okasaki uses a related language - ML - in his book. The ML family of languages are the standard strict, functional languages (Standard ML of New Jersey is perhaps the reference implementation but see also standardml.org). Okasaki also includes an appendix with examples in Haskell, which is the standard lazy functional language.

The difference between lazy and strict languages is the order in which code is evaluated. Most languages are strict. Unlike most languages, Haskell only evaluates something when it is absolutely necessary. Each parameter to a function, for example, is passed as a "thunk" of code, not a value. If the value is not required inside the function, the parameter is left unused; if it is required (say as part of a result that needs to be displayed) then the thunk is evaluated. This evaluation may trigger a whole slew of evaluations of functions that "should" have been called earlier (from a Java programmer's point of view).

Laziness is both good and bad. The bad side is obvious: the order in which code is executed my be very different from the order in which the program was written and some serious book-keeping is necessary in the compiler to juggle the thunks of code and their final values. The reordering of code could cause mayhem for IO operations, for example (in practice, of course, Haskell includes a solution to this problem).

The good side is that laziness can help make programs more efficient and, while the definition of ML doesn't include laziness, individual ML implementations -- including OCaml and SML/NJ -- include it as an extra.

Much of Purely Functional Data Structures (the second of three parts) focuses on how to use laziness to make data structures efficient. Lazy evaluation allows book-keeping actions to be postponed, for example, so that the cost of maintaining the data structure in an efficient form can be averaged across several read/write operations (improving worst case limits - avoiding a very slow response if the data happen to be in a "bad" order).

An understanding of how the efficiency of algorithms is rated (the big-O notation) is one piece of knowledge that this book does assume, along with a basic grasp of what Stacks, Queues, Trees, etc, are.

This lazy boost in efficiency is needed because, even though functional languages may be getting faster, it's not always possible for them to implement the efficient algorithms used in imperative (non-functional) programming.

But I'm getting ahead of myself, because I haven't described what a functional language is, or why it is useful. These are the topics of the first part of the book, which explains how functional languages, which make it impossible to change variable values by direct assignment, support persistent data structures. This section is fairly brief, and while it's a good refresher course for someone who's not had to worry about such things since studying at university, it's not sufficient as an initial introduction to functional programming in general.

There's a good explanation of functional programming in the Wikipedia, but, in all honesty, I don't see how anyone can really "get it" without writing functional code (just as I, at least, couldn't understand how OOP worked until I wrote code that used objects).

So forgive me for not telling you why functional programming is good (This paper is one famous attempt), but perhaps a better question to focus on is "Why should you spend the time to investigate this?" The best answer I can give is that it leads to a whole new way of thinking about programming. Describing functional programming as "excluding assignment to variables" doesn't do justice to the consequences of such a profound change (one I found almost unimaginable - how can you program without loop counters, for example?).

There's a practical side to all this too - learning new ways of thinking about programs makes you a better programmer. This ties in closely with the final part of Okasaki's book, which explores a few fairly esoteric approaches to data structures. Who would have thought that you can design data structures that parallel the way in which you represent numbers? Some of this is pretty heavy going - I can't say I understood it all, but I'm taking this book with me on holiday (it's slim - just over 200 pages) and I'll be bending my brain around some of the points in the last few chapters as I lie in the sun (here in the southern hemisphere it's late summer).

So just who would benefit from this book? It seems to me that it's most valuable as a second book on functional programming. There are a bunch of texts (and online guides) that can get you started in functional programming. This one goes further. It shows how to exploit the features of functional languages to solve real problems in applied computing. Admittedly, they are problems that have already been solved in imperative languages, but you might find that you, too, come to enjoy those famous benefits of functional languages. The algorithms in this book let you enjoy those benefits without paying the price of inefficiency.

Andrew Cooke last reviewed for Slashdot The Aardvark is Ready for War . You can purchase Purely Functional Data Structures from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

15 of 427 comments (clear)

  1. Over where I work by Anonymous Coward · · Score: 5, Funny

    Management prefers dysfunctional programming.

  2. I don't even see the code.. by laurent420 · · Score: 5, Funny

    blonde... brunette..

  3. Nice Review! by jaaron · · Score: 5, Informative

    Nice to see some actual content on Slashdot!

    By the way, this reminds me of the recent article on Domain Specific Languages over on Martin Foweler's website. Another aspect of programming worth investigating.

    --
    Who said Freedom was Fair?
  4. Not a typewriter by JoshuaDFranklin · · Score: 5, Funny
    But I'm getting ahead of myself, because I haven't described what a functional language is, or why it is useful. These are the topics of the first part of the book...

    You know, your computer is not a typewriter, so you really could have rewritten that part of your review...

    1. Re:Not a typewriter by ornil · · Score: 5, Funny

      He did, but because of lazy evaluation it ended up after the second part.

  5. I like any language by donnyspi · · Score: 5, Funny

    that supports GOTOs. Those are the best!

  6. Re:Purely *Functional* Data Structures by MooseByte · · Score: 5, Funny

    "As opposed to what, Data Structures that don't work? Yeah, we need more books on those..."

    Fear not, MFC is extensively documented.

    (Complains I, recently bit by yet another lame long-acknowledged-yet-unfixed bug in M$'s class libraries.)

  7. This was a great review by jkauzlar · · Score: 5, Interesting
    I've been learning OCaml on my own for the past several weeks and I've been wondering many of the things that this book seems to address, such as how the functional paradigm solves common problems differently than the imperative. I know first-class functions can significantly reduce the amount of code needed in many procedures...

    I guess what I'm saying is that I've used languages like Perl and Python considerably and ignored the functional aspects of the language, probably much to my disadvantage. I think a good study of a purely functional language could really improve my perl, python, or ruby.

    1. Re:This was a great review by SparafucileMan · · Score: 5, Insightful
      Actually perl has an alternate syntax that includes a functional language, and, of course, you can always write a functional language in perl. But most Perl code online (CPAN) isn't written like this, which pretty much defeats the purpose.

      For example, where I work, I have to write in ColdFusion sometimes. ColdFusion has 2 syntaxes: a tag-based one that looks like HTML and is four times redudant and impossible to deal with, but is easier for people, and a second syntax with first-order functions. Writing in the first-order function syntax is easier, more efficient, clearer, and easier to debug in everyway, except all my co-workers write in the tag syntax, which forces me to, as well. It sucks.

      The point is that programs tend to be written to the lower-common-demoninator of the language, which makes the difference between functional, procedural, and oo languages so huge when there is really no difference.

  8. Re:Functionals by Kupek · · Score: 5, Insightful

    I agree that functional programming languages are quite useful, but speaking as a coder who learned functional programming just last year in class, I can say that functional languages are a lot more complicated than procedural languages.

    I don't know how you can say that the languages are more complicated. Scheme, for example, is the simplest language I know of, in terms of syntax and semantics.

    I think you're confusing this with how difficult it is to think functionaly when you've been raised to think imperatively. Functional programming is a different paradigm than imperative programming, and as such, you have to think differently. If you've been programming imperatively for a while, learning another imperative langauge is often straight forward; you learn the basics of the syntax, what the language provides natively, and how you can construct what it does not provide. You already know how to solve problems with that style of language.

    Learning a functional language, however, is more than just having to learn a different syntax and set of rules (assuming you've been raised imperatively). You have to learn a different way of solving problems. And until you've done that for quite some time (I, for example, have not), I don't think you're qualified to make the judgement calls you did.

  9. Re:Purely *Functional* Data Structures by I_Love_Pocky! · · Score: 5, Insightful
    I never got a Comp-Sci degree myself

    Then how the heck do you know what the teachers are teaching? Maybe you have only talked to the morons who go through comp-sci thinking it is a programming degree. I know that there was a large percentage of people I went through school with who chose to ignore things like functional programming simply because they had the mind set that they were there to get a software development job after school. It was taught, but it mostly fell on deaf ears. If it wasn't c++/MFC they thought they wouldn't need it (and hence spent no time trying to learn it). They were the same crop of students who thought it was stupid that they had to take differential equations and physics, because they would never use that in "the field." They just wasted their education.

    I personally thought it was really interesting, and as such I am now a graduate student. I love comp-sci for what it actually is, which is not programming.

    Why am I replying to a troll??? Oh well, I feel better now.
  10. Online version available by johnnyb · · Score: 5, Informative

    Also, since this was this guy's thesis, it's also available online

    See http://www-2.cs.cmu.edu/~rwh/theses/okasaki.pdf.

    I suggest you get the book, however, as it's a great read.

  11. Re:Functionals by Anonymous Coward · · Score: 5, Insightful

    but speaking as a coder who learned functional programming just last year in class, I can say that functional languages are a lot more complicated than procedural languages.

    Wouldn't you be a little bit more qualified to comment if you had several years of experience with both functional and imperative languages, first?

  12. Re:Functionals by gangien · · Score: 5, Informative

    I can say that functional languages are a lot more complicated than procedural languages

    What?? More complicated? No no no.. They are certainly not more complicated. For instance the funcitonal languages:

    (display "Hello World") -Scheme

    main =
    do
    putStr "Hello World" - Haskell

    vs

    #include
    main () {
    printf("hello world"); }- C

    #!/usr/bin/perl
    print "hello world"; - perl

    Now those differences are subtle, but A. The functional languages are easier to read(and I think most any none biases person would aggree with me) B. No whacky syntax, hell scheme syntax is only () C. The data types are simple, Pointers? please. Ect ect. The thing that makes functional programming difficult is the lack of an imperative control flow. Which is how people tend to solve problems. For instance if you want to sum up the numbers 1 through 8, in an imperative language it'd be

    tol = 0
    for i = 1 to 8
    tol += i
    end for

    in scheme it'd be
    (define (sum a b)
    (if (equal? a b)
    b
    (+ a (sum (+ 1 a) b)))

    which is confusing. And not obvious. But as for other people's code, that is always the case. Well I've never seen a language that oculd get around that anyhow, the closest I think is java.

  13. Took Okasaki's data structures course by philgross · · Score: 5, Interesting
    Here's a reason to get a Computer Science degree at a good school: you can take a course on data structures taught by Chris Okasaki, the book's author. I took Advanced Data Structures from him at Columbia in 1999. Now he's at West Point.


    The course was pretty mind-blowing. He knows his stuff. It was a bit freaky to watch him grading programming assignments by just reading them, not running them, and yet never missing a mistake.


    I would recommend the book not just as an introduction to advanced data structures in functional languages, but as a guide to some of the more interesting data structures of the last fifteen years, regardless of implementation language.


    -- Phil Gross