Book Review: PostgreSQL 9.0 High Performance
eggyknap writes "Thanks in large part to the oft-hyped 'NoSQL' movement, database performance has received a great deal of press in the past few years. Organizations large and small have replaced their traditional relational database applications with new technologies like key-value stores, document databases, and other systems, with great fanfare and often great success. But replacing a database system with something radically different is a difficult undertaking, and these new database systems achieve their impressive results principally because they abandon some of the guarantees traditional database systems have always provided." Keep reading for the rest of eggyknap's review.
PostgreSQL 9.0 High Performance
author
Gregory Smith
pages
Packt Publishing
publisher
468
rating
Packt Publishing
reviewer
184951030X
ISBN
takes the reader step-by-step through the process of building an efficient and responsive database using "the world's most advanced open source database"
summary
8/10
For those of us who need improved performance but don't have the luxury of redesigning our systems, and even more for those of us who still need traditional transactions, data integrity, and SQL, there is an option. Greg Smith's book, PostgreSQL 9.0 High Performance takes the reader step-by-step through the process of building an efficient and responsive database using "the world's most advanced open source database".
Greg Smith has been a major contributor to PostgreSQL for many years, with work focusing particularly on performance. In PostgreSQL 9.0 High Performance, Smith starts at the lowest level and works through a complete system, sharing his experience with systematic benchmarking and detailed performance improvement at each step. Despite the title, the material applies not only to PostgreSQL's still fairly new 9.0 release, but to previous releases as well. After introducing PostgreSQL, briefly discussing its history, strengths and weaknesses, and basic management, the book dives into a detailed discussion of hardware and benchmarking, and doesn't come out for 400 pages.
Databases vary, of course, but in general they depend on three main hardware factors: CPU, memory, and disks. Smith discusses each in turn, and in substantial detail, as demonstrated in a sample chapter available from the publisher, Packt Publishing. After describing the various features and important considerations of each aspect of a database server's hardware, the book introduces and demonstrates powerful and widely available tools for testing and benchmarking. This section in particular should apply easily not only to administrators of PostgreSQL databases, but users of other databases, or indeed other applications as well, where CPU, memory, or disk performance is a critical factor. Did you know, for instance, the difference between "write-through" and "write-back" caching in disk, and why it matters to a database? Or did you know that disks perform better depending on which part of the physical platter they're reading? How does memory performance compare between various common CPUs through the evolution of their different architectures?
At every step, Smith encourages small changes and strict testing, to ensure optimum results from your performance efforts. His discussion includes methods for reducing and correcting variability, and sticks to easily obtained and interpreted tools, whose output is widely understood and for which support is readily available. The underlying philosophy has correctly been described as "measure, don't guess," a welcome relief in a world where system administrators often make changes based on a hunch or institutional mythology.
Database administrators often limit their tools to little more than building new indexes and rewriting queries, so it's surprising to note that those topics don't make their appearance until chapters 9 and 10 respectively, halfway through the book. That said, they receive the same detailed attention given earlier to database hardware, and later on to monitoring tools and replication. Smith thoroughly explains each of the operations that may appear in PostgreSQL's often overwhelming query plans, describes each index type and its variations, and goes deeply into how the query planner decides on the best way to execute a query.
Other chapters cover such topics as file systems, configuration options suitable for various scenarios, partitioning, and common pitfalls, each in depth. In fact, the whole book is extremely detailed. Although the tools introduced for benchmarking, monitoring, and the like are well described and their use nicely demonstrated, this is not a book a PostgreSQL beginner would use to get started. Smith's writing style is clear and blessedly free of errors and confusion, as is easily seen by his many posts on PostgreSQL mailing lists throughout the years, but it is deeply detailed, and the uninitiated could quickly get lost.
This is also a very long book, and although not built strictly as a reference manual, it's probably best treated as one, after an initial thorough reading. It covers each topic in such detail that each must be absorbed before further reading can be beneficial. Figures and other non-textual interruptions are, unfortunately, almost nowhere to be found, so despite the author's clear and easy style, it can be a tiring read.
It is, however, one of the clearest, most thorough, and best presented descriptions of the full depth of PostgreSQL currently available, and doubtless has something to teach any frequent user of a PostgreSQL database. Those planning a new database will welcome the straightforward and comprehensive presentation of hardware-level details that are difficult or impossible to change after a system goes into production; administrators will benefit from its discussion of configuration options and applicable tools; and users and developers will embrace its comprehensive description of query planning and optimization. PostgreSQL 9.0 High Performance will be a valuable tool for all PostgreSQL users interested in getting the most from their database.
You can purchase PostgreSQL 9.0 High Performance from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
Greg Smith has been a major contributor to PostgreSQL for many years, with work focusing particularly on performance. In PostgreSQL 9.0 High Performance, Smith starts at the lowest level and works through a complete system, sharing his experience with systematic benchmarking and detailed performance improvement at each step. Despite the title, the material applies not only to PostgreSQL's still fairly new 9.0 release, but to previous releases as well. After introducing PostgreSQL, briefly discussing its history, strengths and weaknesses, and basic management, the book dives into a detailed discussion of hardware and benchmarking, and doesn't come out for 400 pages.
Databases vary, of course, but in general they depend on three main hardware factors: CPU, memory, and disks. Smith discusses each in turn, and in substantial detail, as demonstrated in a sample chapter available from the publisher, Packt Publishing. After describing the various features and important considerations of each aspect of a database server's hardware, the book introduces and demonstrates powerful and widely available tools for testing and benchmarking. This section in particular should apply easily not only to administrators of PostgreSQL databases, but users of other databases, or indeed other applications as well, where CPU, memory, or disk performance is a critical factor. Did you know, for instance, the difference between "write-through" and "write-back" caching in disk, and why it matters to a database? Or did you know that disks perform better depending on which part of the physical platter they're reading? How does memory performance compare between various common CPUs through the evolution of their different architectures?
At every step, Smith encourages small changes and strict testing, to ensure optimum results from your performance efforts. His discussion includes methods for reducing and correcting variability, and sticks to easily obtained and interpreted tools, whose output is widely understood and for which support is readily available. The underlying philosophy has correctly been described as "measure, don't guess," a welcome relief in a world where system administrators often make changes based on a hunch or institutional mythology.
Database administrators often limit their tools to little more than building new indexes and rewriting queries, so it's surprising to note that those topics don't make their appearance until chapters 9 and 10 respectively, halfway through the book. That said, they receive the same detailed attention given earlier to database hardware, and later on to monitoring tools and replication. Smith thoroughly explains each of the operations that may appear in PostgreSQL's often overwhelming query plans, describes each index type and its variations, and goes deeply into how the query planner decides on the best way to execute a query.
Other chapters cover such topics as file systems, configuration options suitable for various scenarios, partitioning, and common pitfalls, each in depth. In fact, the whole book is extremely detailed. Although the tools introduced for benchmarking, monitoring, and the like are well described and their use nicely demonstrated, this is not a book a PostgreSQL beginner would use to get started. Smith's writing style is clear and blessedly free of errors and confusion, as is easily seen by his many posts on PostgreSQL mailing lists throughout the years, but it is deeply detailed, and the uninitiated could quickly get lost.
This is also a very long book, and although not built strictly as a reference manual, it's probably best treated as one, after an initial thorough reading. It covers each topic in such detail that each must be absorbed before further reading can be beneficial. Figures and other non-textual interruptions are, unfortunately, almost nowhere to be found, so despite the author's clear and easy style, it can be a tiring read.
It is, however, one of the clearest, most thorough, and best presented descriptions of the full depth of PostgreSQL currently available, and doubtless has something to teach any frequent user of a PostgreSQL database. Those planning a new database will welcome the straightforward and comprehensive presentation of hardware-level details that are difficult or impossible to change after a system goes into production; administrators will benefit from its discussion of configuration options and applicable tools; and users and developers will embrace its comprehensive description of query planning and optimization. PostgreSQL 9.0 High Performance will be a valuable tool for all PostgreSQL users interested in getting the most from their database.
You can purchase PostgreSQL 9.0 High Performance from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
I've not read the book but have read comments from several developers who contribute to PostgreSQL. All comments I've read on this book give it a strong thumbs up.
Take it or leave it, but based on feedback from the people who know the internals of PostgreSQL, this book is worth owning if PostgreSQL is important to you in the least.
I realize this is off-topic, but I have e-mailed the admins and gotten no response so I figured a fellow slashdot member might be able to help me.
Ever since the new design went live, slashdot seems to ignore my preferences. For example, I have "book reviews" filtered so I do not see the entire article (since I am not interested) and yet they still pop up on the front page with full text. It's a minor annoyance, but I was wondering if this is just a problem in the new design or something I'm doing wrong.
Similarly, the exclusions by keyword does not seem to be working beyond a single word (in my case "idle"). If I put multiple keywords (like "bookreview") it makes my entire front page empty. I have tried with spaces, commands, semi-colons, etc.
Again, I realize this is grossly off-topic, but I am hoping a fellow user could help me make my slashdot experience better (or back to how I enjoyed it).
Thanks!
While relational and non-relational databases have been on the rise over the last thirty years, largely owing to their ease-of-use, portability, and documentation, there is something to be said for going directly to the source.
The fastest possible way to store and retrieve data of all shapes and sizes is B-trees. These are commonly used in video games but can also sometimes be found in business applications. The fastest possible code is assembler, hand-optimized by those who are good in assembler; not coincidentally, assembler is also used where performance is of utmost importance (also video games.)
Rookies may be content with letting somebody else do the work of storing their data, but if you want the best hire a video game programmer to code your data backend solution. These are the people who work in frames per second.
unfortunately for me i am in the process of using postgres with the postgis add-on to do spatial analysis of all types in the database, instead of, lets say, using java to do all your spatial awareness, intersections and so forth, or writing your own code to do all of it. This leads to better code maintenance, performance and other optimizations. I am part of the Postgres/postgis mailing list and the amount of questions regarding spatial queries in the database has risen immensely. Too bad this book doesn't tackle this.
Summary: 8/10
...
Pages: Packt Publishing
Publisher: 468
Achille Talon
Hop!
You need this book: PostGIS in Action
NoSQL is great solution for some problems, but it's hardly a panacea. Gavin Roy did a presentation at the last PgCon comparing Pg to a number of the NoSQL options. I thought it was interesting reading.
http://www.pgcon.org/2010/schedule/events/219.en.html
Time and time again, the question of Oracle-like hints for PostgreSQL pops up on the PostgreSQL mailing lists. I thought I share some links as I find the topic fairly interesting. Hopefully the DBAs out there will too.
Why PostgreSQL Doesn't Have Query Hints
Why PostgreSQL Already Has Query Hints
Plan Tuner - Ripped from the above link
And in case you don't know, this is a great place to stay current with PostgreSQL development and technology.
There's also Hinting at PostgreSQL, by the author of the book being reviewed here (me), covering what hinting mechanism are available. And finally Why the F&#% Doesn't Postgres Have Hints?!?! suggesting why some feel that still isn't enough.
Thanks to Joshua for the nice review here. There are actually a few more samples from the book than just the one chapter; here's a full list of them:
In addition to this one and the customer reviews at Amazon, there have been two other reviews by notable PostgreSQL contributors: Buy this book, now and PostgreSQL 9 High Performance Book Review.
As alluded to in the intro, the book tries to cover PostgreSQL versions from 8.1 though 9.0, with a long listing of what has changed between each version to help you figure out what material does and doesn't apply. So most of the advice applies even if you're running an older version. There is also a companion volume to this one of sorts also available, PostgreSQL 9 Admin Cookbook, that was written at the same time and coordinated such that there's little overlap between the two titles. That one focuses more on common day to day administration challenges, less on the theory.
Official pronounciation recording. Postgres-Q-L. "Postgres" is also fine.