Domain: aw-bc.com
Stories and comments across the archive that link to aw-bc.com.
Stories · 2
Bioinformatics in the Post-Genomic Era
nazarijo (Jose Nazario) writes "As a biochemist by training, I found Jeff Augen's Bioinformatics in the Post-Genomic Era very interesting. Though I left the field some years ago, I was using the bioinformatics tools that are covered in the book daily and still look in from time to time. Naturally I was curious to see a larger perspective, as well as any progress that has occurred in the past few years. Augen's book gave me part of the larger picture, but it could have done more." Read on for the rest of Nazario's review.
Bioinformatics in the Post-Genomic Era
author: Jeff Augen
pages: 388
publisher: Addison-Wesley Longman
rating: 7
reviewer: Jose Nazario
ISBN: 0321173864
summary: Genome, Transcriptome, Proteome, and Information-Based Medicine
Bioinformatics is the science of biological information, namely sequences and metadata about organisms and sequences. What's interesting about this field to many people, both inside the sciences and outside of them, is the large volume of data that gets analyzed and the results that emerge on a daily basis. The field is obviously interesting for its medical advances and the rapidly growing life sciences business, and it has grown complex over the past ten years or so. Following the sequencing of the human genome, new challenges have arisen for everyone involved. Augen's Bioinformatics provides a good introduction to this new field of research for students in the sciences, and for anyone with a decent undergraduate education in modern biology. I think that this accessibility of the material is one of the book's biggest winning points.
After an introduction to the book and the subject area of bioinformatics (chapters 1 and 2), Augen begins at the level of the structure of a gene (chapter 3). Here, anyone with an undergraduate level understanding of genetics or molecular biology can begin using the book and bridging the gap to the new areas of modern bioinformatics. Augen then describes how basic sequence analysis is performed at the DNA sequence level (in chapter 4). The material in Bioinformatics covers some of the higher-level methods for sequence analysis, including hidden Markov models, neural networks, and pattern discovery, and introduces some of the common algorithms found to do this analysis.
Chapter 5 then covers transcription, the process of going from DNA to mRNA. Beginning with the biology behind this activity (the ribosome and the larger "transcriptome"), Bioinformatics then describes how you would perform transcriptional analysis. Here, Augen shows how you go from a wet lab to a computational lab and describes what classes of experiments you perform to gather data and then what kinds of analysis you perform on it. This chapter introduces some of the more common clustering techniques for data aggregation and understanding.
The next step in the DNA -> RNA -> protein chain is found in chapter 6, which covers the translation process. Coupled to chapter 7, which describes protein structure prediction and searching, these two chapters bridge the next gap between laboratory data and computational analysis. Protein folding and structure analysis was one of my pet areas of study as a graduate student, and Augen's text does a decent summarization of the field to date. The resources listed and techniques described are definitely on par with the common practices in the field.
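For readers coming from the computer science side, the DNA -> RNA -> protein chain can be sketched in a few lines of Python. This is an illustration only and is not taken from Augen's book: the function names `transcribe` and `translate` are hypothetical, the input is assumed to be the coding strand containing only A, C, G and T, and translation simply starts at the first AUG and stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: amino
    for (a, b, c), amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Return the mRNA for a DNA coding strand (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")
        amino = CODON_TABLE[codon]
        if amino == "*":  # stop codon
            break
        protein.append(amino)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ccATGAAATGGTGAtt")
    print(mrna, translate(mrna))
```

Real pipelines also have to handle both strands, all reading frames, introns and ambiguous bases, which this sketch ignores.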
Finally, Bioinformatics gets into the next major area of bioinformatics, medical databases. Augen's bridge from genetics to medical science is complete, and he discusses how medical professionals utilize databases and can begin to predict disease, for example, based on data mining. The final chapter, "New Themes in Bioinformatics," covers exactly that, but also what Augen refers to as "workflow computing," or basically going about being a bioinformatics scientist. One of my favorite emerging areas in bioinformatics, metabolic pathway elucidation, is also covered briefly.
I've shared this book with a few friends who are all studying computer science or are practicing computer scientists. I did so because Augen's material does a good job of explaining my background and introducing them to some of the forms of analysis I bring into my own work. It does a good job of that, and gets them quite excited. Bioinformatics really bridges a number of fascinating areas of computer science, including data mining and high performance algorithms. Augen's Bioinformatics is a good introduction to the field for them, and really for anyone who has taken a couple of biology courses in college.
Where the book falls short, however, can be grouped into two main areas. The first is the failure of Augen's presentation of the algorithms. While the methods used to describe computational algorithms in Bioinformatics are common for non-computer scientists, they are completely unusable for computer scientists who are used to a specific algorithm presentation style that looks more like pseudocode than rambling text. The ambiguities this presents for a technical reader are unfortunate, especially if anyone studying bioinformatics is supposed to be computer science literate. The book itself assumes a life science literacy, so this isn't an unreasonable expectation of the reader.
The second area that consistently falls short in the book is the utility of the information given. While I am significantly happier with the quality and depth of material presented in Augen's book than in the O'Reilly bioinformatics series, where the book fails to deliver is in showing the reader how to actually use the data they gather. After all, the book shows various sequence analysis algorithms and discusses tools available to do this work, but it only devotes a few pages (out of over 370 in total) to a workflow that can be used. Also, the book sometimes fails to point the reader at very worthwhile web resources, including meta sites like the SDSC Biology Workbench site, and just says "some Perl scripts" for local data analysis. As such, you'll have to go a few extra miles on your own to make use of the data sources.
I guess a third complaint of the book for me is that Augen has ignored or omitted significant bodies of research that fit squarely into the scope of the book. For example, Ken Dill's research into protein folding models, as well as Martin Karplus' work on the subject, receives no mention, nor does the topic of Bayesian network analysis when Augen discusses time series data analysis. These aren't new; they've been around for many years and influenced most of the field, and their absence is noted. The book's spotty coverage in some places, like these, is noticeable.
Bioinformatics does a few things well, but overall reads too much like a biology textbook to be useful to the average computer scientist. More emphasis on the practice of bioinformatics and data analysis would have made this book stronger and complemented the substantive background material well. Finally, using an approach more similar to the computer science approach would have been a tremendous benefit, since the material really is computer science in part. That said, I think this is probably the best introduction to this exciting area of science that I have yet seen.
You can purchase Bioinformatics in the Post-Genomic Era from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
BSD Hacks
GMan00 writes "A flurry of BSD UNIX-related (Berkeley Software Distribution) books have hit the bookstores recently, and more are on the way. From books specific to Secure Architectures with OpenBSD in April 2004 and the reissue of The Design and Implementation of the BSD Operating System for FreeBSD 5.x (expected in August 2004), to Michael Lucas' series of BSD books from No Starch Press, print documentation is certainly available for those interested in learning about the free, open source UNIX system which powers operations such as the Yahoo! portal and the Sendmail.org website, Verio and Pair hosting, not to mention web server survey site Netcraft. Dru Lavigne's BSD Hacks (O'Reilly and Associates, May 2004) is the latest book in these releases, and is an enormously useful resource for system administrators and end-users alike." Read on for the rest of George's review.
BSD Hacks
author: Dru Lavigne
pages: 427
publisher: O'Reilly & Associates
rating: 10
reviewer: George
ISBN: 0596006799
summary: A great array of hacks you can perform on your BSD box, many applicable to all the BSDs, including FreeBSD, NetBSD, OpenBSD and Darwin/OS X.
Dru writes the BSD Basics column on O'Reilly & Associates' OnLamp. Her clarity and fluid style are perfect for those looking to understand aspects of the BSD operating systems. I have had some email communications with Dru about various New York City *BSD User Group-related activities, and managed to speak with her several times at BSDCan this past May.
Like most computer nerds, Dru has a sense of humor. Unlike most, however, she's actually funny.
BSD Hacks is the first book that is almost solely focused on hacks for sysadmins, without boring you with the details of basic operating system installation and configuration that have been so well documented elsewhere. BSD Hacks is not just for sysadmins, though. Intermediate and advanced BSD users will also find the book an excellent tool. For those who find difficulty in BSD installs and other fundamentals, on the other hand, it's best to start with the FreeBSD Handbook, the NetBSD Guide or the OpenBSD FAQ.
There are lots of good hacks buried in the various BSD books and scattered around the internet in different HOWTOs and tutorials. But BSD hacking is the sole purpose of BSD Hacks; there's no need to browse through install screens and overviews of TCP/IP before getting to the heart of the matter.
With 100 listed hacks, multiplied by an impressive level of detailed angles for each, Dru provides an array that demands the placement of this book right in your server room, not in a pile of "must-read-at-some-distant-point-in-the-future" texts.
The majority of hacks are applicable to all the BSDs, including Darwin and OS X, although some are specific to one BSD or another.
This review obviously can't list every hack, although you would be smart to sit and work through the book yourself over a weekend or two. But it is possible to provide a good flavor of BSD Hacks in brief. O'Reilly and Associates does give a good glimpse on their Sample Hacks page, but let's do a quick walk-through ourselves.
The first chapter is called "Customizing the User Environment," and is probably best for end-users looking to go beyond their first steps. But it does include some useful hacks, such as "Use an Interactive Shell" that certainly fit well into the arsenal of any sysadmin, not to mention Hack #12 "Use Multiple Screens on One Terminal."
The second chapter, "Dealing with Files and Filesystems," also contains gems for both end-users and sysadmins. The use of mtree, which maps a directory hierarchy, is mentioned as a tool for recovery. Later on in chapter 6, Dru details its use as a makeshift data integrity checker, thus filling the role often played by products such as Tripwire.
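The idea behind that integrity check is simple enough to sketch: record a fingerprint of every file in a hierarchy, then compare a later snapshot against the baseline. The Python below is only an illustration of the technique and is neither mtree itself nor code from the book; the function names `snapshot`, `compare` and `demo` are hypothetical, and unlike mtree it ignores ownership, permissions and timestamps.

```python
import hashlib
import os
import tempfile


def snapshot(root):
    """Map each file path (relative to root) to the SHA-256 digest of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            digests[os.path.relpath(full, root)] = digest
    return digests


def compare(baseline, current):
    """Return sorted lists of added, removed and changed relative paths."""
    added = sorted(set(current) - set(baseline))
    removed = sorted(set(baseline) - set(current))
    changed = sorted(
        path
        for path in set(baseline) & set(current)
        if baseline[path] != current[path]
    )
    return added, removed, changed


def demo():
    """Build a throwaway tree, tamper with it, and report what changed."""
    with tempfile.TemporaryDirectory() as root:
        for name, text in [("conf.txt", "a"), ("gone.txt", "b"), ("same.txt", "c")]:
            with open(os.path.join(root, name), "w") as handle:
                handle.write(text)
        baseline = snapshot(root)

        # Simulate tampering: modify one file, delete one, add one.
        with open(os.path.join(root, "conf.txt"), "w") as handle:
            handle.write("changed")
        os.remove(os.path.join(root, "gone.txt"))
        with open(os.path.join(root, "new.txt"), "w") as handle:
            handle.write("d")

        return compare(baseline, snapshot(root))


if __name__ == "__main__":
    print(demo())
```

As with any checker of this kind, the baseline is only trustworthy if it is stored somewhere an intruder cannot rewrite it.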
Another great tool Dru covers in the second chapter is g4u, a free ghosting program that gives you the ability to perform quick restores over ftp. Ghosting a drive image is an incredibly useful tool, whether it's about replicating servers or doing a quick reinstall and configuration when a server fails in an emergency.
Chapter 3 is entitled "Boot and Login Environments." Its hacks aren't limited to basic system administration; it also offers some useful security ones, including changing your /etc/passwd file to Blowfish encryption and utilizing OPIE, which is built into FreeBSD, for one-time passwords.
"Backing Up" is the focus of Chapter 4. It includes some very creative methods of handling that necessity, and also includes an excellent primer on Bacula, which is increasingly gaining prominence as a cross-platform backup system.
Chapter 5 covers "Network Hacks," and continues on educating a sysadmin. Included in this chapter is the tcpdump program, a vital tool for watching traffic flowing by your network interfaces.
There's a strong security focus in Chapter 6, entitled "Securing the System." While security hacks are sprinkled generously throughout the book, this chapter works with firewalling with IPF and PF, in addition to covering SSH and Snort. It also includes the earlier mentioned 'intrusion detection-lite' approach with mtree.
Chapter 7, "Going Beyond the Basics," explores scripting, analyzing dreaded buffer overflows and more. Dru also includes a bit on "Creating a Trade Show Demo," not something you'd expect documented in print anywhere, but nevertheless quite useful for anyone working for the BSDs at a conference.
Dru continues with "Keeping Up-to-Date" in Chapter 8, which includes useful details on upgrading and downgrading your installed ports.
The final chapter is "Grokking BSD." "Grok," as Dru comments, refers to the science fiction writer Heinlein's Martian phrase for having a "thorough understanding." Dru covers creating your own manual pages, dealing with custom patches, playing with dictionaries and more.
Certainly there are no walls between each chapter, as many of the hacks could be shifted around. All the more reason to work your way through the book from beginning to end.
One useful addition for this book would have been some way of denoting the BSDs (in some cases, all of them) to which each listed hack applies. Certainly not all are available to Darwin and Apple's OS X. And certainly there's no point in making the OpenBSD /etc/passwd file encrypted in Blowfish, since that is its default.
Many of the hacks can be found somewhere in the manual pages, on some useful website, buried in another book or in the mind of some developer somewhere, though not necessarily in the annals of official documentation. But there's no single book or site that provides the depth and breadth that Dru provides. She managed to tap into the thoughts of dozens of developers and sysadmins around the world, greatly enhancing the variety of hacks in this book.
As a side note, the scope of BSD Hacks isn't limited to just the BSD family. Many of these are likely applicable to Linux and the other UNIX systems. But with recent, impressive increases in the BSD install base, there's a good chance that you can access a BSD box somewhere.
Whether you're a sysadmin managing hundreds of servers, or a power user ready to go beyond the obvious, BSD Hacks belongs next to your CRT.
You can purchase BSD Hacks from bn.com. Slashdot welcomes readers' book reviews. To see your own review here, carefully read the book review guidelines, then visit the submission page.