Review: An Introduction to Genetic Algorithms
Background
It was in the early nineties when I became interested in these sorts of things. I was enrolled in a commerce program, but somehow got onto reading such popular science books as Levy's Artificial Life: A Report from the Frontier Where Computers Meet Biology and Waldrop's Complexity: The Emerging Science at the Edge of Order and Chaos.
Those books made me make my next degree a computer science degree.
Emergent Computation
I was fascinated by the idea of computation emerging from the bottom up: from simple rules acting together in simple ways. This is in marked contrast to the traditional artificial intelligence view that complex behaviour typically only arises from the top down: from the complex interactions of complex rules.
I could appreciate the uses of traditional AI techniques, but emergent computation seemed somehow right to me.
Genetic Algorithms
Notwithstanding my simplistic explanation, there's an obvious example right in front of us. Evolution is a relatively simple process (everyone's heard of Darwin, right?) that has produced very complex output (e.g. us). What if we could harness the power of this evolutionary computation?
John Holland had the idea of mimicking this process of evolution within the computer. He encoded potential solutions as strings of zeroes and ones (the language of the computer), much as human genotypes consist of DNA strands. He developed these strings into actual solutions and measured their success against a particular problem, much as we might measure our success in life. Then he bred another generation, selecting the best individuals to reproduce ("survival of the fittest"), and applying crossover ("sex") and mutation on the strings for good measure.
That's another simplistic explanation, but as time went on these strings got better and better at solving the problem. And it didn't matter which problem. The same process could be used on almost any problem. He called this process a genetic algorithm ("GA").
An Introduction
This book is a good introduction to that world. The first three chapters present an overview of the field, and illustrate how GAs can be applied both in practical problem solving and in more theoretical research environments.
The author explains some of the history and background of GAs, the biological terminology, and its equivalent GA terminology. She provides examples and uses figures effectively.
The entire book has an "overview" feel to it. It isn't very long, and aims for breadth rather than depth. Mitchell writes with clarity, and is great at explaining the subject matter. It's not a difficult book to read.
Theory and Practice
The next two chapters cover the theory and practice of genetic algorithms. Chapter 4 is the most difficult, as it covers Holland's Schema Theorem and other mathematical and statistical observations. Fortunately, you don't lose much if you gloss over the equations, and they're there if you're into that sort of thing.
Chapter 5 is the fun stuff. Mitchell doesn't provide code, but that's okay because the explanation is lucid and sufficiently detailed to implement in code. She discusses ways of encoding the problem, implementing selection methods and the various genetic operators, and setting the parameters of the GA.
To test this theory and practice, each chapter concludes with thought exercises and computer exercises.
Applicability
Dating from 1996, the book benefits from being relatively up-to-date. It borrows from papers and studies up until then, which you'll recognize if you've browsed through other literature (such as the Santa Fe Institute's Artificial Life Proceedings).
Mitchell does reserve a critical view. She's careful to point out where further study is required, and that's important as this remains a maturing area of computer science. She also points out promising avenues for future study.
Summary
I found this book to be an excellent introduction to the field, even though I'd read articles and papers on GAs beforehand. I'd recommend it to anyone interested in genetic algorithms and ready to go beyond the popular science descriptions, but not yet ready for the hardcore books and not willing to waste time on the poorer quality "GAs made E-Z" books.
Basically, this is an excellent quality "GAs in a Nutshell" book. When you've finished it, you might be interested in Goldberg's Genetic Algorithms in Search, Optimization, and Machine Learning.
The book's official site contains a more detailed table of contents, while Mitchell's book page contains solutions to selected thought exercises, an expanded index, and errata.
You can purchase this book at Amazon.
TABLE OF CONTENTS
Preface
Acknowledgements
1. Genetic Algorithms: An Overview
2. Genetic Algorithms in Problem Solving
3. Genetic Algorithms in Scientific Models
4. Theoretical Foundations of Genetic Algorithms
5. Implementing a Genetic Algorithm
6. Conclusions and Future Directions
Appendix A: Selected General References
Appendix B: Other Resources
Bibliography
Index
One thing I find sorely lacking in many books on algorithms is any discussion of why you would select one over another
Do you mean GA versus say a Newton search method? GA is sometimes referred to as a method of last resort. This may be unfair, because many practical problems are not mathematically "nice". I am just getting into GA and I have very complicated simulations underlying my objective functions. We previously computed derivatives for these; it was a huge effort both computationally and for the programmer. One thing that I like about GA is that wrapping the optimizer around an arbitrarily complex objective function is really easy. Also, the parallelism is really good ("embarassing"), especially for distributed computing with message-passing (think beowulf).
For me, the bad thing is that convergence isn't nice and quadratic like some derivative based methods out there. On the other hand, quadratic convergence generally works only near the optimum and derivative based optimizers really only find local minimums (no guarantees about the optimum being globally optimal). Derivative based methods can blow up if you pick a bad guess objective too. Perhaps a good strategy is a combination -- use GA to get into the neighborhood of the global optimum and then use derivative based methods to find it.
I should stress that all this is for my particular application (groundwater). YMMV. Others with different objectives living in differently constrained control spaces will have different experiences. Also, to be fair I should point out that programs like ADIFOR make derivative computations easy to program.
Some helpful optimization links:
Decision Tree for Optimization Software
GA Archives
Comments like this are unfit and have a low probability of being allowed to reproduce. As more and more highly fit comments come along the selection pressure will eliminate such comments.
(sigh) if only it were true.
Ok how long till we have a genetic slashdot?
Unfortunately I have a profound mistrust of programs that I don't understand/can't maintain. While there are some areas where GA's can be very nicely applied, mainstream programming most likely won't get to use them much.
I'm more curious about their use in creating faster/cheaper hardware (but that's probably because I'm a software guy).
Unbreakable toys can be used to break other toys.
One thing I find sorely lacking in many books on algorithms is any discussion of why you would select one over another. If you want to see a shining example of what I'm wanting, take a look at Knuth's _The Art of Computer Programming_ and see how he tests competing algorithms to compare their costs and benefits in actual use. I hope there's something in this book about how one would decide whether or not to use a genetic algorithm.
Zbignew Michalewicz
Genetic Algorithms + Data Structures = Evolution Programs.
2nd ed., Springer-Verlag, 1994
Comes with useful source code. Is not afraid of equations, but is mostly practical (in this case that stuff works better than the other one).
I have run programs based on code from this book and they worked (didn't help me much 'cause the whole thing is too computationally expensive for my purposes, but that's another story entirely...)
Kaa
Kaa
Kaa's Law: In any sufficiently large group of people most are idiots.
In my wanderings around the Web I haven't found a better resource than The Genetic Programming Notebook. It is by far the most comprehensive site I've found. Here you will find everything. The three main topics are:
- Genetic Programming
- Genetic Algorithms
- Artificial Intelligence, including A. Life.
Within each section are tutorials, FAQs, links, bibliographies, papers, journals and generally everything you need to get started and become a GP, GA or A. Life pro.These are very interesting fields which are really coming into their own with the advent of Linux and free software.
Experimenting with these techniques is as easy as downloading an existing library. One of the best is GALib, a library of C++ Genetic Algorithm objects.
Have fun.
As multiprocessing becomes cheaper (thanks in part to Linux), GA's should find more uses. It is very, very easy to parallelize a GA. (You don't need Beowulf to do it.) See September's Linux Journal for a case of a GA designed only to solve a particular mathematical problem concerning White Dwarf stars. No Beowulf here, just a customized, 33-node Genetic Algorithm machine. Cost? A mere $22,000. Not too shabby.