Parsing Algorithms and Resources?
Derek Williams asks: "I'm a senior majoring in computer engineering & computer science and I've been programming for about 7 years, mainly in C and Java. While I've had quite a few courses that delve into some of the deeper topics of programming (e.g. Object Oriented Design), I find that the majority of programs I write, both for work and elsewhere, involve parsing. Although I have no problem tackling these sorts of programs, I was wondering if there was some branch of computer science dedicated to the study of parsing. What books and websites out there are of interest to someone looking to learn more about parsing and algorithms relating to it?"
Parsing Techniques - A Practical Guide
Flexible Parsing
Workshop on The Evaluation of Parsing Systems
Robust Parsing
Parsing Resources
Probably the last one on that list would be the most useful starting place...
You've got a couple of choices -- finding yourself a good regular expression library seems like a good start ;-) If you're looking to do something a little more interesting than just lexical analysis, check out the red dragon book (better known as Compilers: Principles, Techniques, and Tools by Aho, Sethi & Ullman). I used it in my compiler course and I can tell you that they cover the various parsing techniques (recursive descent, LL, LR, SLR, LALR, etc.) very well, along with some other stuff. They concentrate on Lex/Yacc as tools -- you may prefer to check out ANTLR, Terence Parr's parser generator. It can be targeted at a bunch of languages and can also produce tree walkers for when it comes time to use your parsed data.
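
To make the "regex for the lexer, recursive descent for the parser" idea a bit more concrete, here is a rough sketch in Python. It is my own toy example, not something from the dragon book or ANTLR: the grammar (arithmetic with + - * / and parentheses) and the names tokenize, Parser and evaluate are all made up for illustration. Each grammar rule becomes one method, which is the whole trick behind recursive descent.

```python
import re

# Assumed toy grammar (not from any of the books or tools mentioned above):
#   expr   := term (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '(' expr ')'
TOKEN_RE = re.compile(r"\s*(\d+(?:\.\d+)?|[-+*/()])")


def tokenize(text):
    """Lexical analysis: split the input into number and operator tokens."""
    tokens = []
    pos = 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected input near position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """Recursive descent: one method per grammar rule, evaluating as it goes."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # Look at the next token without consuming it (None at end of input).
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        # Consume and return the next token.
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        tok = self.advance()
        if tok == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is None or tok in ("+", "-", "*", "/", ")"):
            raise SyntaxError(f"expected a number or '(', got {tok!r}")
        return float(tok)


def evaluate(text):
    """Tokenize, parse, and make sure no tokens are left over."""
    parser = Parser(tokenize(text))
    result = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return result
```

Calling evaluate("2 + 3 * (4 - 1)") walks expr, term and factor in turn, so precedence and left associativity fall out of the shape of the grammar rather than from special-case code. A generator like Yacc or ANTLR produces this kind of machinery from a grammar file instead of having you write it by hand.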