PostgreSQL 9.0 Released

← Back to Stories (view on slashdot.org)

Posted by Soulskill on Monday September 20, 2010 @11:10AM from the milestone-achieved dept.

poet writes "Today the PostgreSQL Global Development Group released PostgreSQL 9.0. This release marks a major milestone in the PostgreSQL ecosystem, with added features such as streaming replication (including DDL), Hot Standby and other nifty items like DO. The release notes list all the new features, and the guide explains them in greater detail. You can download a copy now."

5 of 344 comments (clear)

Min score:

Reason:

Sort:

Cool by iONiUM · 2010-09-20 11:16 · Score: 4, Interesting

I read the notes, noticed the Column and WHEN triggers. Is this in other SQL databases? If it is, I haven't seen it before. In any case, it's pretty cool that you can setup triggers on a conditional statement. That would really help me out in a lot of scenarios, as I work in the BI space, so alerting is a big deal.
1. Re:Cool by lanner · 2010-09-20 12:17 · Score: 4, Interesting
  
  I hate to say it, but good/useful features like that will be abused by stupid DB guys who can't program.
  Once upon a time I worked in the entertainment industry and was working on a big MMO game project.
  Company X could not scale up their game clusters past about 1000 players. Somewhere between 1000 and 2000 players, the game would just start bogging down and in-game events piled up and everything trainwrecked and was unplayable.
  So, it turns out that most of the game logic was built off of complicated SQL stored procedures, triggers, logic, etc. Basically, they were using their hard drive as a processor.
  The problem was with the MS-SQL server disk IO Wait. CPU was okay on all of the systems, but they could just not imagine that the disks in the database server (only one DB server per cluster) could be the source of the problems. Every time there was an item dropped, crafted, or certain other special things happened, there was an atomic commit and that basically required writing to disk on the spot. Get enough of that going and you're whole 20-something CPU cluster sits with idle CPU while the DB server works it's hard drives.
  Company went chapter 11, all staff eventually let go, and later was sold off for nothing.
  I had pointed out this problem to them, but it was late in development and when you tell the people who are responsible for designing the product that they are idiots, well, they behave like idiots and don't really listen. Not that they could have fixed it anyway due to time and intellect restraints.
  Anyway, point of the story is that cool SQL features are cool. But don't use your hard drive as a processor.
2. Re:Cool by GooberToo · 2010-09-20 16:41 · Score: 4, Interesting
  
  There is a difference between the engine checking a constraint versus a call into an interpreted language. One is doing less work. The other is doing more work. Which is ideal? Obviously less work is better. And that's before you even get into the PL/SQL code which is essentially doing the same work, but slower. Furthermore, all too often, triggers are called when there is no work to be done but you don't know that until the PL code decides this is the type of row change its interested, otherwise it should have really been a NOP. Whereas with the column trigger, the call to the PL code simply never takes place. So we not only save on the call but all of the wasted time inside of a trigger which ultimately decides its has nothing to do.
  Also, when the trigger should be called, in a row trigger, triggers frequently must evaluate which columns have changed before it can even determine if it cares about this row. Should it then decide it does care about this row, likely you've already passed through a mass of CASE and/or IF/THEN/ELSE codes, which ultimately states yet more CASE and/or IF/THEN/ELSE to determine exactly what it should now be doing now that its decided it does need to process this row. Or, you can call a much smaller section of code which is dramatically simplified because one, its only called when its pre-qualified (saving the creation of much redundant code) and two, since its now pre-qualified, we can immediately get to performing whatever logic the trigger in question should do when the column in question has changed.
  Those are worlds apart in performance, readability, maintainability. Not to mention the added granularity makes possible a reduction in the test matrix, regression tests, and even makes it more difficult (though far from impossible) to create a regression.
As always... by jd · 2010-09-20 11:47 · Score: 5, Interesting

The new features are much admired by all (and deservedly so), but a heavier footprint typically means poorer performance overall even if there's accelerated performance in specific areas or improved programming. I'd like to see a performance plot, showing version versus performance versus different types of system load, in order to see how well new stuff is being added in. It might be merged in great and the underlying architecture may be superb, but I would like to see actual data on this.
Also, PostgreSQL and MySQL aren't the only Open Source SQL databases. Including variants and forks, you really need to also consider Ingres, Drizzle, MariaDB, SAP MaxDB, FireBird and SQLite. If you want to also compare against Closed Source DBs, then you'd obviously want to look at DB/2, Oracle, Cache and Sybase. I'd love to see a full comparison between all of these, feature-for-feature, with no bias for or against any specific development model or database model, but rather an honest appraisal of how each database performs at specific tasks.
I like PostgreSQL a lot. I rate it extremely highly. However, without an objective analysis, all I have is my subjective perception. And subjective perceptions are not something I could credibly use in a workplace to encourage a switch. For that matter, subjective perceptions are not something I would consider acceptable for even telling a friend what to use. Perceptions are simply not credible and have no value in the real world.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
1. Re:As always... by greg1104 · 2010-09-20 15:27 · Score: 5, Interesting
  
  You've got the performance part backwards for PostgreSQL; it goes up with every release, sometimes a little, sometimes in a big way. See PostgreSQL history for a comparison covering versions 8.0 to 8.4. The mild regression in 8.4 shown there is actually reversible; it's mainly because a query related parameter for how many statistics to collect and use for query planning was increased by default. That results in better plans for most real-world queries, but it detuned this trivial benchmark a little bit. You can get performance back to 8.3 levels just by turning the parameter back to the "optimized for trivial queries" default of the older versions if you care about that. Most people prefer the new default. In the real world, 8.4 is actually faster due to improved handling of background VACUUM tasks too, which don't show up in simple benchmarks either.
  I'm the current lead architect on building a PostgreSQL Performance Farm to prevent regressions from popping into future versions of the code too. There is a recently completed beta client for that purpose. We're in the process of working out how to integrate into future development, starting with 9.1, so that potential regressions are spotted on a commit by commit basis. I haven't seen any performance regressions between 8.4 and 9.0, only moderate improvements overall and large ones in specific areas that were accelerated.
  Now, if you use some of the new replication features aggressively, that can add some overhead to slow down the master. But that's true of most solution; the data coming off the master has to take up some time to generate. The way PostgreSQL 9.0 does it is is pretty low overhead, it just ships the changed blocks around. Theoretically some statement based solutions might have lower overhead, but they usually come with concerns about non-determinism on the slaves when replayed (random numbers, timestamps, and sequence numbers are common examples).
  Given the non-disclosure terms of most of the closed source databases, nobody can publish benchmarks that include them without going through something like the TPC or SPEC process. The last time that was done in 2007, PostgreSQL 8.2 was about 15% slower than Oracle running the same database-heavy workload. And note that it was PostgreSQL 8.3 that had one of the larger performance increases, so that was from just before a large leap forward in PostgreSQL performance.
  At this point, Oracle and most other commercial databases still have a large lead on some of the queries run in the heavier TPC-H benchmarks. Links to more details as to why are on the PostgreSQL wiki. It just hasn't been a priority for development to accelerate all of the types of queries required to do well in that benchmark, and nobody so far has been willing to fund that or the subsequent certification via the TPC yet. Sun was the only one throwing money in that direction, and obviously the parts of that left within Oracle will no longer do so.