Linux Troubleshooting
norburym writes "The Bruce Perens Open Source Series of books published by Prentice Hall PTR is a strong collection of nearly 20 volumes focusing on Linux and open source technology. Edited by Linux guru and former Debian GNU/Linux Project Leader, Bruce Perens, the books are aimed toward developers, sysadmins and power users. Several months following the release of a new print volume, a free electronic version is made available on Prentice Hall PTR's web site. The series includes some excellent editions including Official Samba-3 HOWTO and Reference Guide (2nd ed.), Linux Quick Fix Notebook and PHP 5 Power Programming. The newest book by Mark Wilding and Dan Behman, Self-Service Linux: Determining Problems and Finding Solutions, is another well-written and worthy companion to this series." Read the rest of Mary's review.
Self-Service Linux: Determining Problems and Finding Solutions
author: Mark Wilding and Dan Behman
pages: 456
publisher: Prentice Hall, PTR
rating: 8
reviewer: Mary Norbury-Glaser
ISBN: 013147751X
summary: Linux Troubleshooting
This is not a basic Linux HOW TO book: authors Wilding and Behman take the reader to a level past the introductory Linux OS installation instructions and KDE/GNOME window dressing changes. In all real life scenarios and at some critical point, a Linux user or admin will need to troubleshoot some aspect of the system they use at home or the systems they manage on the job. This book is for that power user, systems administrator or developer who will, out of base necessity, require a proper bag of tools and practical guidance in establishing an effective set of skills for troubleshooting one or more Linux systems.
A quick scan of the table of contents gives a very abbreviated summary of the book and actually belies the depth of the contents. The authors break the chapters into very self-contained topics including best practices and initial investigations, strace and system call tracing explained, the /proc filesystem, compiling, GDB (GNU Debugger), Linux system crashes and hangs, and kernel debugging, among others. These chapters are filled with detailed examples that perfectly illustrate real world scenarios that any Linux user will be familiar with.
Chapter 1 is an overview of the complex process of problem determination and resolution and begins with steps to configure your Linux system(s) for optimal troubleshooting. The authors outline a selection of tools they recommend the reader/user install on their Linux system(s) in anticipation of future problems: strace, ltrace, lsof, top, traceroute/tcptraceroute, ping, hexdump, tcpdump/ethereal, GDB and readelf. These and many others are categorized by type (process information and debugging, network, system information, files and object files, kernel and miscellaneous) in Appendix A, "The Toolbox." Wilding and Behman stress the importance of balancing the need to solve issues immediately vs. building troubleshooting skills. They outline four phases of problem investigation (using your own knowledge and skills to investigate, using the Internet, conducting a deeper investigation, and getting help) and discuss where the various tools fit into different scenarios, how to collect information about system changes, what resources are available on the Internet (Google, USENET, Bugzilla, etc.), how to handle more difficult problems and where and how to get outside help, if necessary.
Chapter 2 explains system call tracing, introduces the strace tool (traces system calls between a process and the kernel) and how to use it to diagnose errors related to the operating system. This is a very well organized chapter with plenty of depth. Wilding and Behman offer an extensive discussion of this first tool, progressing from simple examples that illustrate how to read the strace output, how/when to use strace options, timing system call activity, tracing an existing running process, to many practical debugging examples.
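To give a flavor of what reading strace output involves, here is a minimal sketch (not taken from the book) that scans saved trace lines for system calls that returned an error. It assumes the common single-line form `name(args) = result ERRNO (text)`; the function name `failed_calls` and the sample lines in the usage note are made up for illustration, and lines with a PID prefix or split "unfinished/resumed" calls are not handled.

```python
import re

# Matches the common single-line form: name(args) = result [ERRNO (text)]
# Hex results are tried first so that "0x7f..." is not cut short at the "0".
STRACE_LINE = re.compile(
    r"^(?P<name>\w+)\((?P<args>.*)\)\s+=\s+(?P<result>0x[0-9a-f]+|-?\d+|\?)"
    r"(?:\s+(?P<errno>E[A-Z0-9]+)\s+\((?P<message>[^)]*)\))?"
)


def failed_calls(lines):
    """Return (syscall, errno, message) for every traced call that failed."""
    failures = []
    for line in lines:
        match = STRACE_LINE.match(line.strip())
        if match and match.group("errno"):
            failures.append(
                (match.group("name"), match.group("errno"), match.group("message"))
            )
    return failures


if __name__ == "__main__":
    # Illustrative lines only; real traces are usually much longer.
    sample = [
        'open("/etc/ld.so.cache", O_RDONLY) = 3',
        'open("/tmp/missing.conf", O_RDONLY) = -1 ENOENT (No such file or directory)',
    ]
    for name, errno, message in failed_calls(sample):
        print(f"{name} failed with {errno}: {message}")
```

A missing configuration file showing up as an ENOENT on an open call is the sort of clue a trace tends to surface quickly.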
Chapter 3, "The /proc Filesystem," looks at user process information (/proc/self, /proc/<pid>, /proc/<pid>/environ, /proc/<pid>/mem), kernel information and manipulation (/proc/cpufreq, /proc/cpuinfo, /proc/devices, /proc/meminfo, /proc/partitions), and system information and manipulation (/proc/sys/fs, /proc/sys/kernel, /proc/sys/vm). The authors run through files and directories relative to /proc and describe how to view information about the kernel and currently running processes. This chapter gives a good example of how to use the "kernel magic sysrq key" feature (using the ALT-SysRq hotkeys to get kernel information) when a system hangs. Output from the commands showPC, showMem and showTasks is given as examples.
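As a rough illustration of how approachable /proc is (this is my own sketch, not an example from the book), the snippet below parses two of the files mentioned above. The helper names `parse_meminfo` and `parse_environ` are hypothetical; the parsers take text or bytes as input so they can be tried on any platform, and the live read at the bottom only runs where /proc exists.

```python
from pathlib import Path


def parse_meminfo(text):
    """Turn the 'Key:   value [kB]' lines of /proc/meminfo into a dict of ints."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not 'Key: value'
        fields = rest.split()
        if fields and fields[0].isdigit():
            # Most values are in kB; a few (e.g. HugePages counts) have no unit.
            info[key.strip()] = int(fields[0])
    return info


def parse_environ(raw):
    """Split the NUL-separated bytes of /proc/<pid>/environ into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        name, sep, value = entry.partition(b"=")
        if sep:
            env[name.decode(errors="replace")] = value.decode(errors="replace")
    return env


if __name__ == "__main__":
    proc = Path("/proc")
    if (proc / "meminfo").exists():  # only present on Linux
        mem = parse_meminfo((proc / "meminfo").read_text())
        print("MemTotal (kB):", mem.get("MemTotal"))
        env = parse_environ((proc / "self" / "environ").read_bytes())
        print("PATH seen by this process:", env.get("PATH"))
```

Reading /proc/self/environ shows the environment the process was started with, which can differ from what a shell reports later.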
The next chapter details the GCC (GNU Compiler Collection) and compiling. The authors don't attempt to walk the reader through basic kernel source compiling; instead they concentrate on how to decipher errors that arise from compiling source. They outline some basic compile failures (environment/setup errors, compiler version differences, user error, code error, etc.) then show a common error involving both incorrect code and different allowances made between compiler versions. Wilding and Behman show the reader how to decipher the kernel error and how to use both existing documentation and bugs posted on the Internet to correct the errors and rerun the compilation successfully. This is a very practical demonstration of how compile errors can be worked out and solved quickly.
Chapter 5 begins with a definition of "stacks" and a description of stack structure, local variables, and stack frames. Also shown is how to display the raw stack in a debugger like DDD (Data Display Debugger) or GDB (the GNU Debugger) in order to perform a detailed stack analysis. The authors use the backtrace command to look at stack traceback output from GDB, then cover "walking the stack" (manually walking the raw stack frame by frame using the dladdr function), common causes of stack corruption, and SIGILL signals.
Debugging applications is the subject of the next chapter with the majority of the chapter dedicated to The GNU Debugger. This is a logical place for a discussion on debuggers as the authors point out that they are particularly useful when problems can't be solved through log files, error messages, etc., when a problem is of an immediate nature (i.e. doesn't extend over a long period of time) and when source is available. GDB command line editing is covered along with how to control a live process by running the process directly through the debugger, how to attach to a running process and how to use a core file (or a process image) to perform debugging. The authors also examine viewing the memory map and variables, looking at the contents of register dumps, working with C++ (inline functions and exceptions), and problems with threaded applications. A brief description of the Data Display Debugger (DDD) GUI front-end to GDB is included at the conclusion of the chapter.
Chapter 7 deals with "System Crashes and Hangs" and how to assemble the appropriate information necessary to troubleshoot a system problem using various tools and techniques: using the syslog, setting up and using a serial console, using the SysRq kernel magic hotkey, examining the oops report generated by a manual kernel trap, considering hardware failure issues, and setting up cscope to index kernel sources. This chapter prepares the reader for documenting proper and extensive details about errors and problems not only for rapid diagnosis but also in the event he or she needs to call in an expert.
Kernel Debugging with KDB is a brief chapter that instructs the reader on how to enable and activate KDB, basic commands associated with its use, and some examples on how to use it. Several good illustrations of where KDB proves useful over other tools are included.
The final chapter explores the ELF file format (executable and linking format) for shared libraries and executables. The authors provide a comprehensive look at the ELF standard on Linux. They start with basic definitions and concepts (symbol names and C versus C++, linking with static libraries and run time linking, and the run time linker) and prep the reader with some source code that is used in later chapter examples. They examine the ELF file structure (the header using hexdump, segments/sections with readelf, the program header table, and the section header table). This is probably the strongest chapter in the book. There is enough information and instruction in this chapter to arm a Linux system troubleshooter to follow the practical examples with little effort.
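For readers who have never looked inside an ELF header, the following is a small sketch of my own (the book uses hexdump and readelf, not Python) that decodes the first 20 bytes of a file: the magic number, the class and byte-order bytes, and the e_type and e_machine fields. The function name `parse_elf_ident` is hypothetical, and the lookup tables cover only the most common values.

```python
import struct
import sys

ELF_MAGIC = b"\x7fELF"
ELF_CLASS = {1: "ELF32", 2: "ELF64"}
ELF_DATA = {1: "little-endian", 2: "big-endian"}
ELF_TYPE = {0: "NONE", 1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}


def parse_elf_ident(header):
    """Decode class, byte order, e_type and e_machine from an ELF header."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]
    if ei_data not in ELF_DATA:
        raise ValueError("unknown ELF data encoding")
    byte_order = "<" if ei_data == 1 else ">"
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", header, 16)
    return {
        "class": ELF_CLASS.get(ei_class, "unknown"),
        "data": ELF_DATA[ei_data],
        "type": ELF_TYPE.get(e_type, hex(e_type)),
        "machine": e_machine,
    }


if __name__ == "__main__":
    # On Linux the Python interpreter itself is normally an ELF executable.
    try:
        with open(sys.executable, "rb") as f:
            print(parse_elf_ident(f.read(20)))
    except (OSError, ValueError) as exc:
        print("could not parse:", exc)
```

The same fields appear near the top of `readelf -h` output, so the two can be compared side by side.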
The book concludes with two valuable appendices that detail the authors' selected tools for Linux problem determination and include a data collector script intended to capture basic critical system information in the event of a problem. As discussed above, the "Toolbox" appendix is a summary of the authors' selection of best Linux tools for diagnosing problems. Each tool has a brief description, where to get it, level of usefulness, when to use the tool and additional notes. Appendix B, "Data Collection Script," offers the reader a sample bash script tool that gathers a broad range of system information. The authors provide several optional switches to increase the amount of data collected with the caveat that time to collect that information also increases.
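The authors' collector is a bash script; purely to illustrate the idea of grabbing a baseline snapshot before trouble strikes, here is a much smaller sketch in Python using only the standard library. The function names `collect_snapshot` and `format_snapshot` and the choice of fields are my own assumptions and do not reflect what the book's script gathers.

```python
import os
import platform
import shutil
import time


def collect_snapshot(path="/"):
    """Gather a small, read-only snapshot of basic system facts."""
    usage = shutil.disk_usage(path)
    snapshot = {
        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
        "hostname": platform.node(),
        "kernel": platform.release(),
        "machine": platform.machine(),
        "disk_total_bytes": usage.total,
        "disk_free_bytes": usage.free,
    }
    # Load averages are not available on every platform.
    if hasattr(os, "getloadavg"):
        try:
            snapshot["load_average"] = os.getloadavg()
        except OSError:
            pass
    return snapshot


def format_snapshot(snapshot):
    """Render the snapshot as 'key: value' lines suitable for a log file."""
    return "\n".join(f"{key}: {value}" for key, value in snapshot.items())


if __name__ == "__main__":
    print(format_snapshot(collect_snapshot()))
```

Saving output like this at regular intervals gives something to compare against when a system starts misbehaving.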
Wilding and Behman assume some familiarity with the Linux system: their advice and instruction are intended for those users who are not afraid of the CLI and who understand basic Linux operating and file system structure. That said, Self-Service Linux: Mastering the Art of Problem Determination is a valuable resource for advanced users and system administrators. In short, this book is for anyone who uses Linux on a daily basis on one or multiple systems. The examples are fully detailed: the reader gets commands, options, output, sample code, and a variety of possible outcome scenarios. Wilding and Behman set out a realistic and practical approach to problem solving; they satisfy the troubleshooter in all of us. Self-Service Linux is a welcome addition to the Bruce Perens Open Source series of Prentice Hall PTR professional reference books.
You can purchase Self-Service Linux: Determining Problems and Finding Solutions from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
A free electronic version is available for download - that's sweet :)
LINUX ONLINE POKER: Linux Poker
This book seems helpful for those who have already got their feet wet in the linux seas, but what about the ultimate linux n00b? I wouldn't mind reading a book which does a good job of presenting linux to the absolute beginner.
Viable Slashdot alternatives: https://pipedot.org/ and http://soylentnews.org/
I know I'm very offtopic here, but doesn't the fact that this is out for free download, yet they still manage to publish and sell these books, prove that mp3 downloading doesn't really cut into music sales?
Hell, I know that more mp3s are downloaded than books, but far fewer people will buy this book anyway.
- http://www.milkme.co.uk
Try getting Wireless WEP running on SUSE 10 and then tell us that Linux doesn't need troubleshooting!
Wireless Wireless Encryption Protocol brought to you by the Department of Redundancy Department...
Who uses KDB??
I can't find a link to download the series or read it online. Am I going blind?
http://askaralikhan.blogspot.com/
Is there a troubleshooting guide in general for linux? A lot of newbies are intimidated by the number of hoops they have to jump through for things like setting up sound etc. I know it can be quite frustrating because I recall back when I first installed Ubuntu in different installations of the same version of Ubuntu, different tricks got my sound to work on my laptop. For that matter, is something like this feasible because of the various distros and the difficulty of hardware support? So far, the best one I've found is the Ubuntu Starter Guide, but it is distro specific...
PS: I've already checked the Linux Documentation Project
No, any OS can have trouble that needs to be corrected. It's just with Linux, you don't just restart a couple times a day to try to solve the problem ;-)
What a big topic. It looks like Slashdot have copied and pasted most of the book on here
- There's no place like 127.0.0.1
is the man pages and other system documentation. Correct, updated and relevant man pages/documentation is a gold mine for trouble shooting and configuration.
I thought I'd give it a shot... only to find this listed as a title that's "coming soon". Am I missing a link here?
fsck Congratulations, you now know how to solve most of the problems you will ever have with Linux.
Bad puns gave me bad karma. =(
...but these are not those books. If you want an absolute beginner's look at Linux, check out Marcel Gagné's excellent Moving to Linux, Second Edition : Kiss the Blue Screen of Death Goodbye! (direct link, no paid click-through). Also worth noting is Paul Sheer's LINUX: Rute User's Tutorial and Exposition, which, while a bit dated, covers the basic concepts of Linux quite thoroughly and also makes quite a good reference guide. I would start with the former, but the latter is free online and the paperback inexpensive to purchase.
Working in a DevOps shop is like playing in a band made up entirely of keytarists.
If you are a very experienced user... this could be interesting. If you don't think yourself to be a very, very, VERY experienced Linux user, this isn't your book. Even so, again, most things are determined through much, much, much easier techniques.
Not a great troubleshooting book IMHO. Good book for someone who thinks they know it all.
See my review at: http://theendlessnow.com/ten/Main/BookReviews#toc9
Bill? Is that you?
Install Windows. It'll make all of those Linux headaches go away.
Why is this marked as troll? The parent is absolutely correct. Amazon's price is $17 bucks cheaper than B&N. Perhaps Slashdot should consider posting alternative links to Amazon in their book reviews now? I thought the "boycott" was over a while ago.
fsck Congratulations, you now know how to solve most of the problems you will Never have with Linux.
"Nae Kin! Nae Quin! Nae laird! Nae master! We willna be fooled again!"
Mods. Come on. Common sense. That was intended to be funny.
I have also bought this book for myself. And it is everything that the reviewer says it is. In fact, I decided to buy this book after reading a review of the book at http://linuxhelp.blogspot.com/2005/12/book-review-self-service-linux.html. And boy! I was able to set right quite a few software-related problems after reading this book. I strongly recommend buying this book.
Linus is that you?
Remember folks, slashdot doesn't have a -1 "disagree" moderation!
see also http://www.linuxtroubleshooting.com/
Another site about linux troubleshooting techniques and strategies. Unaffiliated with the book.
...we should publish a similar book for the n00bies, the cheerleaders, the lingerie models, and the occasional receptionist.
I thought I'd throw in the last three, you know, to bring some sort of balance of nature, and stuff.
In MY day, we just used a good old logic probe, and an analog meter. Now THAT'S troubleshooting.
Luser: My spreadsheet won't open.
Tech: Grab the big blue capacitor, and call me back. . . .
You have a constitutionally protected right to be wrong, and I the right to ignore you.
Well, that's not really a Linux problem. If you, like me, have a wlan card from a fucked-up vendor (Broadcom in my case) who won't release specs to kernel developers, those guys can't do much.
Obviously not so busy that you don't have time to post on /.
Which manpages are incorrect? Have you informed the authors? Do you know how to make a patch?
Don't be a dumbass please. I contribute plenty. And I'm not going to justify having a few minutes to post while keeping up with the community.