Mastering Algorithms with Perl
Mastering Algorithms with Perl is designed to provide the necessary background. It's structured like a traditional algorithms textbook: after describing some basic and advanced data structures (linked lists, trees, heaps, etc.), it has chapters about searching, sorting, sets, matrices, graphs, strings, and some related topics. After the introduction and discussion of data structures, the chapters are relatively independent and could be read in any order. The authors provide plenty of cross-references as well as pointers to books that describe individual subjects in more detail.
The intended audience is programmers who don't have a background in computer science, who know at least some Perl. However, experienced programmers who don't know Perl should have no trouble picking up the basics of the language with this book and a copy of Programming Perl. Also, computer scientists can often use a review of algorithms, and the CPAN pointers are very useful. So, I would go so far as to say that this book would enrich any programmer's bookshelf. A stringent test of the merit of a new technical book is to ask if it adds some value, given the best existing books in its area? I think that Mastering Algorithms with Perl definitely does. It is a well-written introduction to algorithms that is more accessible, practical, and entertaining than standard algorithm books. It leverages off of the strengths of a powerful language and a large base of reusable code.
The rest of this review will evaluate the strengths and weaknesses of Mastering Algorithms with Perl in more depth. The central issue that I will consider is why the reader might or might not prefer an algorithms book that concentrates on a single language, as opposed to a general algorithms book. I will try to be up-front about my biases: as a computer scientist, I consider this book to be a compromise between an algorithms book and a how-to manual. This compromise makes it much more useful to Perl programmers, but it sometimes causes the algorithms content to be too watered down.
It is traditional in algorithms books to describe algorithms in pseudocode, which often superficially resembles Pascal. The difference between pseudocode and real code is that pseudocode is not compilable - it ignores implementation details that are not helpful to understanding a particular example. This is considered to be an advantage: without the clutter, the core of the algorithm is easier to see and understand. At the beginning of the book the authors make the point that the Perl code for a binary search is actually shorter than the corresponding pseudocode. And it's true! The advantage of the Perl program is that we have a readable description of the algorithm, and it's executable too. (Unfortunately, it's often nontrivial to convert pseudocode into real source code - the devil is in the details.) The binary search example is slightly misleading, however, because in this case a native Perl data structure (the array) matches the semantics of the problem extremely well, leading to a clear and concise implementation. Later in the book, particularly in the chapter on graphs, we see examples where Perl's built-in data structures are less well suited to the problems. The executable Perl code for graph operations are much longer than the corresponding pseudocode, and are often so syntactically cluttered that they are difficult to read. Is this a flaw in the book or in Perl? No - it's a consequence of giving examples in runnable code instead of pseudocode. Is the tradeoff worth it? Probably, but it depends on what you're trying to get out of the book.
Another consequence of basing an algorithms book on a real language is that the authors can point readers to existing implementations of the algorithms, in CPAN. It's hard to overstate how big of a win this is. Perl is a powerful language to begin with, but it becomes far more powerful when programmers are able to take advantage of the large body of existing code modules. An unfortunate side effect of the fact that the book talks about specific versions of Perl and about specific CPAN packages is that this information will become outdated much more quickly than the algorithms will. Unless the Perl language and CPAN are exceptionally stable in the future, I would not expect most of this information to be valid for more than a few years - hopefully a new version of the book will be available before this one becomes too out of date.
Because the book provides executable code for the algorithms, it's possible to evaluate the performance of the example code (which is available at the O'Reilly site). The authors benchmark a number of the algorithms that they present, and compare the results. This is a nice change from the discussion of asymptotic running times found in traditional algorithm books, which generally ignore the constant factors that often make the difference between an algorithm being useful in practice or not.
The design and analysis of algorithms is a highly mathematical discipline. A sophisticated set of tools has been developed to evaluate the tradeoffs between various algorithms: How efficiently do they use memory and processor cycles? What is the best, average, and worst case running time of various operations? How does the algorithm scale as the size of the input grows? As it turns out, programmers need to understand a few of these formalisms, particularly the "big O" notation for describing asymptotic running time. I think that Mastering Algorithms with Perl uses theory in just the right way: as an aid to programmers' intuition about algorithms, rather than beating us over the head with formulae and proofs. That said, I think there is one area of theory that this book should have spent more time on: NP completeness. NP-complete problems are solvable, but are believed to be inherently hard: no efficient algorithm has been discovered to solve them. There are a wide variety of NP-complete problems, and they do come up in practice. For programmers, the important thing is first to recognize that an NP-complete problem has been encountered, and that it cannot be solved exactly except in small instances. Then, a heuristic that comes up with a good enough approximation of the solution needs to be found and implemented. This is a practical and well-studied part of algorithm design, and in a 650-page book I would expect more than a page or two to be devoted to it.
Several chapters of Mastering Algorithms with Perl are too shallow to be considered good introductions to the associated areas of algorithms. For example, the chapter on matrices only shows code for some of the more trivial matrix operations; for complex tasks, it tells the reader how to use PDL - the Perl Data Language. Although PDL looks like a useful and powerful package, readers should not confuse knowing how to use it with understanding matrix algorithms. In other words, the matrix chapter is too much of a how-to manual. Other chapters such as the ones on searching and sorting are excellent and avoid falling into this trap. Algorithms is a huge area, and it can't all be covered well in 650 pages. The later chapters are a lot of fun to read, but some of them should probably have been scrapped in favor of more depth in core areas.
In conclusion, this is a well-written, useful book. Viewed as a Perl book it's superb; it complements the strengths of Programming Perl and The Perl Cookbook, and I think most or all Perl programmers would benefit from having a copy. Viewed as a computer science book, it has made a number of compromises in order to focus on a specific language; this is not necessarily a problem but it is something that readers should be aware of.
Acknowledgments: Thanks to Tom Christiansen, Dave Coppit, Bill Pearson, and Jamie Raymond for helpful comments on previous drafts of this review.
Purchase this book at fatbrain.
So if we don't know Perl, or any other language (yet) then we're screwed in reading it?
Argh. Can anybody recommend a good basic book for those of us not from a Computer Science background (my degree is in theatre) but who are trying to get into programming, albiet slowly?
Listen to me Peter, I want this bench. You go sit on that bench over there, and if you're good I'll tell you the rest of
First, I'd like to say that this book is probably *not* for the beginner (as in, read one of the other Perl books first, and write in it for a while, I still have to read Learning Perl sometime...)
:)
Second, I'd like to ask why a good, pseudo-code, readable language *isn't* more popular nowadays. There are many books written like this (my Operating Systems text in school, and many of the examples, are written in something that looks like Pascal with support for multiple processes, although I've never seen such a beast) and their code is very readable.
I used to write in Turbo Pascal 7, and I enjoyed writing classes ("objects") with it. All my code was inherently pretty readable, even when I used nasty tricks, unlike my C code (or most Perl code that I've seen). Later, I did convert some of it to C and C++ with p2c and some hacking on my own, but it would be nice to maintain the readable version of the code.
---
pb Reply or e-mail rather than vaguely moderate.
pb Reply or e-mail; don't vaguely moderate.
provides very little that you can't get in a basic data structures class. I finished the entire book at a swim meet one day because there really wasn't anything that revolutionary behind it. Aside from pseudo-hashes, which look pretty nifty, I don't see much value in investing in this book if you already know data structures/sorting algorithms. I suppose if you didn't have a basis in those it would be a wise investment but even then most of the concepts are fairly simple and fairly intuative. My advice would be to buy an office copy and keep it around as a reference for your programmers if they have a brain freeze, but don't bother with a personal copy. .agrippa.
I would recommend Learning Perl also by O'Reilly & Associates. When I first started learning Perl, this book was invaluable. Perl was also, for me anyway, a good first programming language to learn on. Prior to that, the only other language I had learned was BASIC, and only very little. Most people will recommend not starting with a language like C, due to its complexity. Perl seems to be a good starter.
I now use Programming Perl and Perl Cookbook. Both are from O'Reilly & Associates. This book also sounds like a good reference to have once you have the basics down.
Ryan
P.S. My major was also theater (design).
I have this book and have skimmed through it. This is basically computer science algorithms (zillion of different sorts, graph representation, etc.) implemented in Perl. The problem is that this book has limited relevance to real life. "Perl Cookbook" is much, much more useful in this regard.
So, if you are collecting all Perl book, or are really interested in how to implement red-black trees in Perl, do buy it. On the other hand, if you are looking for snippets of code to solve small-to-medium-sized problems you meet all the time, "Perl Cookbook" is a much better choice.
Kaa
Kaa
Kaa's Law: In any sufficiently large group of people most are idiots.
Can anybody recommend a good basic book for those of us not from a Computer Science background (my degree is in theatre) but who are trying to get into programming, albiet slowly?
You don't necessarily need a book. Although I like having a peice of paper to reference (call me old-fashioned), there are many resources you can access for free on the internet (and print out if you're like me). Just hop over to http://www.google.com and enter the words "Perl Tutorial" for lots of good links.
When will Windows be ready for the desktop?
I would disagree that perl is a good beginners language. It has way to many hacks and gimmics in it. (This is not always a bad thing. Just that it lets beginners get away with too many things. At one point I inherited some horrid perl code that a decent grounding in CS would have helped immeasurably.) I'd suggest a cleaner, strongly typed language like python or java. Both have active development groups, and there's lots of documentation and examples out there. And ORA has books for both languages.
Zapman
There is one part of the review which doesn't make much sense
That said, I think there is one area of theory that this book should have spent more time on: NP completeness. NP-complete problems are solvable, but are believed to be inherently hard: no efficient algorithm has been discovered to solve them. For programmers, the important thing is first to recognize that an NP-complete problem has been encountered, and that it cannot be solved exactly except in small instances.
A big part of this is simply wrong. Problems are NP-complete if it is proven that there is no algorithm which solves it in polynomial time. What he describes is NP-hard. His basic requirement, finding out if a problem is NP-hard needs some literature research. To recognize an NP-complete problem one can read about it if it is known. If no literature is found, showing that the problem is NP-complete will take skills on highly advanced levels.
RSA crypto is a good examples: It relies on factorization being NP which has not been proven. It uses so calld strong primes to avoid polynomial factorization algorithms.
If it's Perl you're looking to get into, the Great O'Reilly offers up a number of books, including Learning Perl, Programming Perl, Advanced Perl Programming, the Perl Cookbook, etc. Start out with Learning Perl. Some other posts mention Python, which is also good for CGI, and you can pick up O'Reilly's Learning Python and Programming Python. Be forewarned, though. I've used both for CGI programming. And when I'm using Python (powerful though it is), I find myself longing for the regexps of Perl.
If you'd like an online tutorial, you might want to check out The CGI Resource Index, which is made by the same guy as Matt's Script Archive. Between the tutorials on the Resource Index, looking at the source of Matt's script, and reading the O'Reilly books, you can learn just about anything you want to know about Perl.
Of course, if you get stuck, you can always go to ng's, irc, or your local Perl nut.
Chapter 10, 'Geometric Algorithms' is available in PDF format, here.
--Kevin
=-=-=-=-=-=
"I think the P-Funk Mothership just landed in my back yard!"
Mastering Perl Device Drivers? :-)
I have some experience developing server applications in Perl. In short, as the app got more complex, I found Perl increasingly more difficult to use. The OO model seemed like a big hack, not unlike the rest of the language. It's very unreadable to me. Readable Perl can be written with some effort. More often than not, its overly flexible syntax leads to a lot of terse or confusing code, especially when non Perl experts try to write it. e.g. everything is done with regular expressions Don't get me wrong -- when I learned Perl, it was the coolest thing. It was great for quick hacks, very powerful. But as an application language it just seems way too loose. Yeah -w, use strict, etc. It was just not designed from the ground up to deal with modules very well. Java's package system is much better. I would much rather use Java (or Python or Smalltalk) for anything other than utility scripts (which Perl is really good for). I notice that even a lot of web sites are moving away from Perl CGI scripts to use JSP and so on. I suspect they found the same thing I did - Perl isn't a very good language for developing _large_ maintainable apps. So what does this have do with with the review? Well, I'm trying to figure out why someone would be studying algorithms in Perl. I guess O'Reilly has to capitalize on the Perl wave and suck every last buck they can from it. I suppose Perl's expressiveness makes it okay for expressing algorithms. Indeed it is easy to do some complex things (though data structures are a big hack too). I think Perl has seen its day, and will quickly be relegated to a system utility scripting language.
Problems are NP-complete if it is proven that there is no algorithm which solves it in polynomial time.
...factorization being NP which has not been proven
Wrong. Problem A is NP-complete if there are no problems in the set NP that are harder than A.
Furthermore, no one has yet proved that NP problems cannot be solved in polynomial time, although it is widely suspected this is true.
What he describes is NP-hard.
The classes *I* took used "NP-complete", "NP-hard" and "hard for NP" synonymously.
Again, just plain wrong. Prime Factorization is known to be NP (NP-complete, in fact). What is NOT known is whether PF can be solved in polynomial time.
It sounds very much like you picked up your Computation Theory knowledge by reading posts on Slashdot. I don't recommend that for people who enjoy flaming.
---
Linux MAPI Server!
http://www.openone.com/software/MailOne/
(Exchange Migration HOWTO coming soon)
In my opion this some book's algorithms are on the useless. Linked lists, while nessecary in some languages, are incredibly useless in perl. Hashes and dynamicly sized arrays eliminate the need for this. Sorting is covered, but perl has builtin sorting, and builtins are always faster than anything you'll write in perl. This is only the first 150 pages of the book. Those pages also include things like heaps and binary trees.
I have this book in my library, but don't think I'd buy it again if I had to rebuild it. It's one of the few perl books that collects any dust on my self.
Python is also conceptually simple, but it stresses operator overloading over object re-use and inheritance trees are typically pretty shallow. Lack of types make for code that's quick to write, not too hard to read, but you have to check them well because your arguments could be anything.
Maybe there should be an "Algorithms in Python" book. The language is ideal for beginners - even moreso than Java because of built-in lists and hashes.
Just my minor nit about typing :)
Ita erat quando hic adveni.
Seriously - read the subject. OOPerl is an amazing book - perhaps the only perl book you'll need. It's very concise, very clear, and the code you end up producing is clean code, not unlike Java or Python code.
Really though you're trolling. Sure there are things wrong with perl (like the fact that the second argument to bless isn't mandatory), but you can create crap code in any language. If you have experience of building a large app in perl and it all went horribly wrong because it got too large - you only have yourself to blame.
Matt. Want XML + Apache + Stylesheets? Get AxKit.
I picked up Lutz & Ascher's Learning Python just over a week ago, and I already love it. One of the great things about Python as a learning language is its "interactive" mode; run the interpreter without a program file ("module" in Python terminology), and you get an interactive prompt. Just start typing in statements; you can define functions, set variables, load modules, etc. Anything with a return value displays that value. It's a lot easier for experimenting with different statements than the usual cycle of edit a file, run it, edit it again, run it again, etc.
I've seen Python described as "pseudocode that runs", and so far, I have to agree; it pushes you toward writing more readable code. Yeah, yeah, yeah, "you can write readable or unreadable code in any language", and that's true. But where Perl seems to require additional effort to write readable code, Python seems to require extra effort to write unreadable code.
Weblogging Considered Harmful:
Introduction to Algorithms, Cormen, Leiserson and Rivest (yes, Ron Rivest, the R in RSA).
Far more mathematically rigorous than the O'Reilly book (from what I read of the O'Reilly book in the bookstore - I didn't buy it because it didn't look like much I didn't already have). No actual code, just pseudocode. I think this is the book you want if you really want to learn about algorithms (but not if you just want to get stuff done in Perl).
It's expensive, but it's a tome (>1000 pages). It was a good class textbook and still makes a very good reference. Check out the Table of Contents.
/* The beatings will continue until morale improves. */
Apologies for the topic drift, back to the regular discussion.
-----
The real meaning of the GNU GPL:
The real meaning of the GNU GPL:
"The Source will be with you... Always."
I think that this was the best Perl book I have ever read (Programming Perl and all the other ORA Perl books coming in close, but then again I've only read Perl books by ORA so Ihave to broaden my horizons), but it is a first printing of it. It has a lot of major errors. I highly reccomend that you visit the author's errata page before reading it and fix at least all of the major errors (there are a few). The author's errata page is more complete and will be updated as errors are found more than I would expect the ORA site is, so I reccomend that you use his site instead of ORA's site for errata concerning this book. Don't make the list of errata scare you though it is a great read and I myself am halfway through reading it in full for the third time.
If you think you know what the hell is going on you're probably full of shit.
If you think you know what the hell is going on you're probably full of shit.
jdube is who I am
Ok, this comment has a lot of inaccuracies. Let me try to clarify.
NP stands for "nondeterministic polynomial", and is probably most easily understood as the class of problems for which the solution can be verified in polynomial time. It includes P, the class of problems that can be solved in polynomial time.
A nice example of a problem that is in NP but not known to be in P is satisfiability. This problem is given as a list of predicates of the form (x1 or x2 or not x3). The problem is finding a set of xi such that all of the predicates are satisfied.
So, it should be obvious that you can verify a solution in polynomial time - just start with the values of xi and check that all the predicates turn out true. However, there is no known general technique for solving this problem than enumerating all the possibilities, which takes exponential time.
NP-completeness takes this idea one step further. It is a large an interesting class of problems that are basically equivalent in difficulty. If you solve one, you've solved them all. Thus, if a problem is NP-complete, there's no known efficient algorithm.
The way people analyze NP completeness is do define reductions, ie show how instances of one problem can be reduced into instances of another NP-complete problem, and vice versa. Maybe this takes "highly advanced skills," but it's actually fairly routine for algorithmicists.
The class of NP-complete problems includes the travelling salesman problem, the Hamiltonian path problem, the knapsack problem, determining collisions of 3D objects, and many others.
NP-hardness is when you can reduce an NP-complete problem into the NP-hard problem, but not necessarily vice versa. Many integer optimization problems are NP-hard.
Factoring is clearly NP, but is not known to be NP-hard. It's entirely plausible that someone (Arjen Lenstra, for example) will come up with a polynomial factoring algorithm, but leave the rest
of the NP pantheon untouched.
There are some crypto algorithms that are based on NP-completeness, but NP-completeness is not in and of itself enough for strong crypto. Even if the problem is hard in general, a specific instance may be easy to solve. Unless you can prove that this never happens, you're hosed. IBM has done some excellent work in this direction with randomized self-reductions for their lattice-based crypto algorithm.
Complexity theory is one of the most beautiful areas in computer science, and NP-completeness is one of the most striking results, as it illuminates a fundamental unity across many seemingly disparate subfields of computer science. It is indeed a shame that this book skimps on its coverage of NP-completeness.
LILO boot: linux init=/usr/bin/emacs
No, I don't have a reference handy. But I could swear we did this in a senior level computation theory class about 4 years ago.
I'm not angling to repeat the nightmares that were my attempts to reduce problems to other NP-complete problems, but it seems pretty obvious that PF is in NP. I can definitely verify a solution in polynomial time. All that would remain is to show that the number of primes increases at the same rate as the number of possible solutions to a SAT problem. Now that I think about it, I don't think the number of primes grows exponentially, so maybe PF isn't NP-complete. Huh.
---
Linux MAPI Server!
http://www.openone.com/software/MailOne/
(Exchange Migration HOWTO coming soon)
The best introductory Java book I've seen is Beginning Java from Wrox Press (can't remember the author). Don't know how good it would be for a rank newbie, but every other Java book I've seen has been even more newbie-unfriendly. Bruce Eckel's Thinking in Java might be better for an experienced programmer trying to pick up a new language, but might cause a porr non-programmer's head to explode.
Weblogging Considered Harmful:
He was a really nice guy, and he told quite a few interesting stories.
- Computer Algorithms: Introduction to Design and Analysis 3rd Ed
by Baase and Gelder. I just took my intro to algo course in which this was used as a text; it is very much readable and the examples really do illustrate the principles they claim to. It spans every thing from a base level introduction to NP-complete.Banfield
Oops. I had the Dylan programming language mixed up with some other Apple product code-named "Sagan". Carl Sagan sued Apple over the name, and lost. Apple changed the development name anyway, to "BHA". When word got out that it stood for "Butt-Head Astronomer", Carl sued again, and lost again. Later, when Apple began work on its Dylan (dynamic language) programming language, Bob Dylan sued. I don't know how (or if) the suit was resolved, but I don't think Apple has had to rename it to "BHM". Yet.
Weblogging Considered Harmful:
Others have responded that Matt's scripts aren't really a good source of information on how to program perl idiomatically, or maybe even correctly. :-)
But to add something to this thread, I'd like to point out that you really, truly can learn much more about Perl than you'd ever get from Matt's Script Archive by going to the Perl home page at www.perl.com
After you've spent a couple of weeks digesting that, it is possible that you would want to know even more, so you can go to Tom Christiansen's Far More Than Everything You've Ever Wanted to Know About... web page, if you haven't already. Oh yeah, and you can buy O'Reilly books, too. :-)
Babar
I'm surprised you made a claim like this without backing it up...
R is definitely for Rivest. Check the "What is RSA?" section of RSA's cryptography FAQ. I quote directly:
RSA is a public-key cryptosystem that offers both encryption and digital signatures (authentication). Ron Rivest, Adi Shamir, and Leonard Adleman developed RSA in 1977 [RSA78]; RSA stands for the first letter in each of its inventors' last names.
/* The beatings will continue until morale improves. */
Without seeing the code for those important parts, I can't say for sure, but given the rest of the non-industrial strength code, one can easily imagine the worst.
That would read better like this:
Because otherwise you have too much needlessly duplicated information.Better yet, you should just split into my ($key, $value) and loop across those in the same way. The anonymous array just seems to hurt legibility.
That should be far more than enough nits and gnats to keep you thinking for a while.
Absence of evidence does not imply evidence of absence.Here's my suggestion: read the CGI.pm source very, very carefully. There's a lot to learn. Good luck. Hopefully, you'll repent of your hackish ways that help give Perl a bad name by spreading bad CGI code around the world.
I shan't be tilting at any straw windmills.I shall, however be patiently awaiting your public apology and contrition.
As for the matters of strict, your disdain for robust coding should scare the hell out of any employer or co-worker. And symbolic reference are 99% of the time used because someone had no clue how to build a proper data structure.
You have a long ways to go yet in becoming a competent software engineer. And you don't seem to be on that path right now.
There's much more to CGI than you seem to realize. For example, do you even understand the critical semantic difference between GET and POST? They're hardly interchangeable.
If I should ever write a book on low-level CGI internals (which I hope to avoid), I shall be sure to include all this. Meanwhile, you should abandon your attempts at wheel re-implementation, because you're doing it in a dangerously cavalier and often completely wrong fashion. I stand by my work: the module is going to get things right that 98% of the script kiddies will never even understand. So use it.
Pay attention. You are dangerous. This is the horribly stupid perl code that people trash the language about, and you're part of the cause!
The Perl books cover, surprisingly enough, Perl. Are you asking for a good book to learn about HTTP, CGI, and HTML from?
This is part of being a glue language. We glue to everything. We can only teach you how to attach the glue. You need to understand what you're gluing to.