Slashdot Mirror


Text Processing in Python

Ursus Maximus writes "If you have read an introductory book or two about Python programming, but you are far from being an expert, then you will benefit a lot from reading this book. If you are a competent programmer in any other language, you will benefit from this book. If you are an expert Python programmer, you will also benefit from this book." Ursus Maximus's review continues below.

Text Processing in Python
author: David Mertz
pages: 520
publisher: Addison Wesley
rating: 10
reviewer: Ursus Maximus
ISBN: 0321112547
summary: How to use Python to process text.

As you probably know, there are many good introductory texts about Python. This is not one of them, for this is an advanced book, but not an inaccessible one. David Mertz has a unique style and focus that we have become familiar with from his series of articles on the IBM Developer Network. Dr. Mertz is more interested in facilitating our learning process than in lecturing us, and rather than fill his pages with impressive examples designed to illustrate his expertise, he gently guides us by offering subtle yet important examples of code and analysis that make us think for ourselves.

He has a special talent for programming in the functional style, and this is a great introduction to that style of Python programming. Thus, this is also a good guide to using the newer features introduced into Python in the last few revisions, which often facilitate the functional style of programming.

The text includes, in an appendix, a 40 page tutorial covering the basic Python language. This tutorial is, like the book, unique in its approach and is worthwhile even for experienced Pythonistas, as it sheds light on some of the underlying ideas behind the syntax and semantics, and it also illustrates the functional style of programming, which is sometimes quite useful when doing text processing. And, despite its many other virtues, this is a book about text processing.

Chapter 1 covers the Python basics, but with a particular eye towards those features most critical and useful for text processing. Chapter 2 covers the basic string operations as found in the string module and the newer built-in string functions. Chapter 3 is about Regular Expressions, and, although I am shy about regexes because of their relative complexity, I am very glad to have read this chapter and will no longer be intimidated when regexes are the correct approach to take! Chapter 4 is on Parsers and State Machines, which are important for processing nested text, as in everyday HTML, XML and the like. This chapter is not as esoteric as its title may sound to relative newbies (like myself), as it does offer useful ideas and principles for dealing with HTML. How much more useful can a topic be than that? It is true that a deep understanding of this subject may be beyond me and other relative duffers, but this chapter has much to offer those like me and I am sure much more to offer professionals.

Chapter 5 is on Internet tools and techniques, and this is a good example of how text processing touches every important area of computer programming. We manipulate text for email, newsgroups, CGI programs, HTML and many other aspects of net programming. A good summary of XML programming is included, as well as useful synopses of other Python internet modules, from a text processing point of view.

Appendix A is the aforementioned selective and short review of Python basics. Appendix B is a ten page Data Compression primer that is quite educational. Appendix C offers the same good service for Unicode, and Appendix D covers the author's own software, a state machine for adding markup to text, which is backed up by his extensive web site that has a lot of free software to support those doing extensive text processing. Lastly, Appendix E is a Glossary for technical terms from the book. This is very much an educational book, and would be suitable for classroom work at the University level, beyond the introductory programming level; in fact, as part of a curriculum to teach programming using Python at the University level, this would be an excellent text for the second course.

One of the highlights of the book is that each chapter is concluded with a problem and discussion section. These are of the highest quality I have encountered in computer texts. Rather than overwhelming the reader with a large number of problems, the author has obviously given a lifetime of thought in coming up with a few key problems that are meant to stimulate thought, creativity, and ultimately understanding and growth in the reader. I will be coming back to the problems often, as they cannot be absorbed quickly anyway; they require thought. These would be most useful in a classroom environment; but as they are accompanied by excellent discussion material, and backed up by the author's web site, the individual reader will be well served also.

The book is more than the sum of its parts. It will be a most useful reference source for when I am doing various text related tasks for some time to come, and it was also a delightful and educational quick read in the here and now. It also amply illustrates the centrality of text processing in all areas of computer science, and I am confident that the book will be useful and educational for all programmers, whatever their area of expertise.

To sum it all up, this book is educational. It is also beautifully bound and printed, and excellently written. I rate it five stars, my highest rating, and heartily recommend its purchase.

You can purchase Text Processing in Python from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

14 of 215 comments

  1. Bing! by FortKnox · · Score: 1, Funny

    Ah, I see you reviewed the book that goes BING!

    --
    Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
  2. You will benefit from this book.... by revery · · Score: 5, Funny

    "If you have read an introductory book or two about Python programming, but you are far from being an expert, then you will benefit a lot from reading this book. If you are a competent programmer in any other language, you will benefit from this book. If you are an expert Python programmer, you will also benefit from this book."

    If you are a practitioner of voodoo and merely handle large pythons, you will benefit from this book.

    If you are an undersea explorer but have heard of pythons....

    --

    Was it the sheep climbing onto the altar, or the cattle lowing to be slain,
    or the Son of God hanging dead and bloodied on a cross that told me this was a world condemned, but loved and bought with blood.

  3. Re:But why... by Slack0ff · · Score: 1, Funny

    C#? What are you talking about? I'm hardcore Visual Basic 6, my friend. Text Process that!

    --
    Everyday You see me is the worst day of my life -Office Space
  4. Re:The book in full by jellomizer · · Score: 5, Funny

    Why do I have a sense of fear whenever I see a link that starts with g and ends with .cx?

    --
    If something is so important that you feel the need to post it on the internet... It probably isn't that important.
  5. I can think of one person... by Motherfucking+Shit · · Score: 3, Funny
    Exactly who wouldn't benefit from reading this book?
    Dr. David Mertz, probably... :)
    --
    "BSD: Free as in speech. Linux: Free as in beer. Windows 10: Free as in herpes." --Man On Pink Corner in #52607549.
  6. Re:But why... by tuffy · · Score: 2, Funny
    the other question is why use C# or Python for Text Processing while there is Perl !

    Why? Because then one would have to program in and maintain Perl code.

    --

    Ita erat quando hic adveni.

  7. benefits by tmark · · Score: 4, Funny

    If you have read an introductory book or two about Python programming, but you are far from being an expert, then you will benefit a lot from reading this book. If you are a competent programmer in any other language, you will benefit from this book. If you are an expert Python programmer, you will also benefit from this book

    And if you're the website posting this glowing review, and collecting affiliate fees, you will also benefit from this book.

  8. Re:Great Intro by L.+VeGas · · Score: 3, Funny

    I read that intro about five times to figure out what he was saying. Basically, if you want to learn Python, you will benefit from this book.

    or....

    This book is good. (Python is implied)

    There you go, I distilled the whole intro into four words.

    Or even better yet: Good book.

  9. to simplify by MasTRE · · Score: 2, Funny

    "If you have read an introductory book or two about Python programming, but you are far from being an expert, then you will benefit a lot from reading this book. If you are a competent programmer in any other language, you will benefit from this book. If you are an expert Python programmer, you will also benefit from this book."
    = No matter what, you will benefit from this book.

    Do I hear a "best thing since sliced bread" coming?

    --
    Must-not-watch TV!
  10. Re:The book in full by HaloZero · · Score: 2, Funny

    It's called rectaphobia.

    --
    Informatus Technologicus
  11. Text processing in Python by jdavidb · · Score: 4, Funny

    A good programmer can write Perl in any language. :)

    (Just kidding. ;) )

    1. Re:Text processing in Python by JurgenThor · · Score: 1, Funny

      More so: A good one can, a bad one will

      --
      GENERAL PUBLIC SIGNATURE (GPS) Any replies (derivatives) of this post must also use the GPS
  12. Re:What do you use python for? by axxackall · · Score: 2, Funny
    Took me a hot hour to figure these out.

    It took me 5-10 seconds to understand each example. Are you sure that software programming is what you should do for a living?

    --

    Less is more !
  13. Contents of the book: by Anonymous Coward · · Score: 0, Funny

    Preface: Why Python? I have no idea.
    Introduction: After reading the Preface, you'll come to the same conclusion as I have.
    Chapter 1: use perl
    Chapter 2: use perl
    Chapter 3: use perl
    Chapter 4: use perl ...
    Conclusion: use perl, if you must use python, write your code in perl, and exec it from python.