Slashdot Mirror


What Is New In PostgreSQL 9.0

Jamie noted a blog that says "PostgreSQL 9.0 beta 2 just got released this week. We may see another beta before 9.0 is finally released, but it looks like PostgreSQL 9.0 will be here probably sometime this month. Robert Treat has a great slide presentation showcasing all the new features."

46 of 213 comments (clear)

  1. SQL! by Anonymous Coward · · Score: 5, Funny

    select FIRST_POST from slashdot where user='Anonymous Coward';

    1. Re:SQL! by medcalf · · Score: 3, Funny

      SQL ERROR: inconsistent indices

      --
      -- Two men say they're Jesus. One of them must be wrong. - Dire Straits
  2. If you got tired of clicking through articles by Qzukk · · Score: 4, Informative

    While the changelog is cool and all, if you just want to see the slides go to http://www.slideshare.net/xzilla/intro-to-postgres-9-tutorial

    --
    If I have been able to see further than others, it is because I bought a pair of binoculars.
  3. Join removal is cool by wandazulu · · Score: 2, Informative

    This bites me occasionally in Oracle where you've got a big query that has lots of tables joined together, and then at some point one of the columns is removed from the select part, and the query performance suddenly goes to hell. Then you have to go through and verify that each table is actually being used (even worse if the column that was removed from the select came from deep joins).

    Go Postgres!

    1. Re:Join removal is cool by interval1066 · · Score: 4, Insightful

      "Go Postgres!"

      Indeed, PostgreSQL is such a great system, in a lot of ways its better than mySQL; I'm constantly amazed by the number of orgs that have never heard of it.

      --
      Python: 'And then suddenly you have a language which says "we're all stuck with whatever the whiniest coder wants".'
    2. Re:Join removal is cool by mcferguson · · Score: 2, Informative
      Actually, here are the entire conditions for join removal (from Robert Hass's blog):

      (1) it's a left join, (2) there is a unique index on all or a subset of the join columns, and (3) none of the attributes from the nullable side of the join are used elsewhere in the query

    3. Re:Join removal is cool by GooberToo · · Score: 4, Insightful

      Well, both the real and "trollish" answer is, MySQL has been trying to catch up for almost five years now - and doesn't look like their even close.

      PostgreSQL has been a better database for a long time now. The pull of MySQL isn't its technical prowess but its "dumbness." Simply put, MySQL provides a lot in exchange for very little. Its the go to database for people who have little DBA experience, don't know what makes for a good RDBMS, or is simply needing a database where ACID doesn't matter.

      Basically MySQL is popular because its the low hanging fruit. Its generally everywhere and most people who need a database don't know any better. So they've heard something about MySQL and its available with their hosting company. That's generally all they needed to know. Of course that completely ignores the fact that for most every project, PostgreSQL provides a vastly superior solution. The down side is, to use PostgreSQL vs MySQL in these cases, you'd have to read all of a dozen pages or so (actually far less, but lets play devil's advocate). And for most, that's simply far too much to ask.

      Its basically the lazy or ignorant DBA's database. Or a database where reliability doesn't matter. Or integrity isn't an issue. There certainly are places for those kinds of databases - its just that most who pick MySQL don't realize they've made those trade offs.

    4. Re:Join removal is cool by Dimes · · Score: 2, Interesting

      What you just stated sounds all well and good...but its not very informed. Or maybe just informed from 10 years ago.

      In general a database is pretty much as good as its DBA. That said, your statement about Postgre being vastly superior is strictly a contextual one, and even then would be a trade-off at best case.

      For the DBMS corner cases that MySQL doesn't do, there are some incredibly important things that Postgre doesn't do.

      I see in the updates that Postgre is finally doing Streaming Replication? Just now? Before this, you had to rely on 3rd party "hacks".

      Meanwhile, MySQL has done instant replication and failover for quite some time.

      More importantly, MySQL has done Master/Master replication for years now, natively, with relative ease. I have been doing Live Hitless upgrades to sites for years now, to infrastructures built on MySQL. Sites doing 5-10k questions per/sec. on minimal hardware(relatively). No Down Time. That is huge when it comes to "stability" of a website or web application when everything is DB dependent.

      Don't get me wrong, I like Postgre. I have for years. But don't go around talking the old line that MySQL is an severely incomplete DB for the ignorant. It has matured over the years into one of the most capable, feature complete, stable, and useful databases available.

      dimes

    5. Re:Join removal is cool by GooberToo · · Score: 2, Informative

      the Windows version didn't work as well as the Unix

      That was true back when PostgreSQL was a cygwin port; many, many years ago. For many years now PostgreSQL is a first rate sibling application on Windows. In fact, one of the killer performance combinations is 64-bit windows with 32-bit PostgreSQL - despite the fact 64-bit PostgreSQL is also available on Windows. The combination allows for 64-bit file cache from the OS (PostgreSQL heavily relies on the OS to provide robust file caching) and 32-bit binaries which further enhances the CPU's cache for executing instructions and caching data. Unless your queries require "64-bit datasets", performance is said to be excellent.

      Bluntly, anyone who tells you PostgreSQL is not a first rate Windows DB either has an ax to grind or is simply ignorant and in no position to be making such recommendations.In the end, you are absolutely correct. The lack of a proper PostgreSQL Windows port not only hindered general use adoption of PostgreSQL, but directly helped catapult MySQL's adoption, all those years ago. That fact was identified and is the reason why a first rate Windows port exists today.

  4. Re:In place upgrades by GooberToo · · Score: 2, Informative

    You're not alone. That issue is one of the last MySQL staples which PostgreSQL users hear about.

  5. Built-in replication by atomic777 · · Score: 4, Interesting

    A better summary of the changes is here.

    After years of resisting, one of the more significant changes is the inclusion of WAL shipping-based replication into postgresql core, and the ability to do read-only queries on the standby systems. This will hopefully go a long way towards appeasing mysql users used to the "easy" replication that mysql provides.

    1. Re:Built-in replication by BSAtHome · · Score: 4, Interesting

      And streaming replication. It makes it a lot easier to have a backup server that is up to date. It makes me happy so I can do partial restores without a lot of fuss.
      Now, if only the enterprise apps could find out to reconnect to the database and continue where they left off without crashing and trashing.

    2. Re:Built-in replication by atomic777 · · Score: 2, Insightful

      I see the replication feature as being more about perception than anything else.

      Postgresql has long had a variety of replication options outside of the core that serve various needs, but it seems that the perception out in the community remained that postgresql was a stable, stand-alone database, and getting replication to work on top involved "hacks", while mysql, despite its faults, had "solid" replication that lent itself better to large installations.

      Of course this perception is far from reality, but it has been deemed a serious enough problem for the postgres team to finally include replication in the core.

    3. Re:Built-in replication by atomic777 · · Score: 2, Insightful

      You've touched on the fact that data replication is a hard problem and all user scenarios can't be (sensibly) solved with a single solution. MySQL replication works well enough for the web crowd that has no idea what ACID stands for and its adoption has spread as a result. There being only one choice of replication solution also makes that an easy choice to make.

      Even being able to choose which replication solution to use with postgresql requires a substantial level of expertise. What postgresql has lacked, until now, is an "out of the box" solution that will be used by default, by the uninitiated, to get postgresql in the door. Then if they ever learn what ACID stands for, if they understand what asynchronous or synchronous replication is, they will be happy as hell that they didn't choose mysql way back when their whole site/business ran on only 2 servers.

  6. other then features... by jellomizer · · Score: 2, Insightful

    What I would love to see is some standardization for SQL languages. It would be nice to take an App say say in PostgreSQL then use it in Oracle if you find that you need to go to a bigger infrastructure... As well move down, as a lot of apps are running on DB servers that are too big for their use. While the language has some nice features it would be much better to have an updated set of common function and calls so you can make your SQL more cross platform. A lot of the real work behind SQL isn't much in the Language but in what is happening underneath.

     

    --
    If something is so important that you feel the need to post it on the internet... It probably isn't that important.
    1. Re:other then features... by schmiddy · · Score: 4, Informative

      You're basically describing the SQL language itself (PostgreSQL does a good job of implementing most of the various SQL standards up to SQL-92 and even SQL-99). Of course, add-on procedural languages like Oracle's PL/SQL aren't going to be supported as-is anytime soon on PostgreSQL, or anywhere else. And of course, each database vendor has their own extensions to the SQL language itself, which other vendors aren't always keen to copy (think MySQL's INSERT ... ON DUPLICATE KEY UPDATE, or PostgreSQL's CLUSTER).

      If you're designing a database application which you want to be easily portable across various SQL databases, just try to keep any non-standard-SQL use to a minimum and use of procedural languages simple and only when necessary. Books by Joe Celko stress this ideology, though my take is that it's just about impossible to really get the most out of your database unless you really make use of its extensions, which are there for a reason. For example, Celko discourages the use of auto-increment columns (serial type in Postgres, auto-increment primary key in MySQL), in favor of manually incrementing your sequence using MAX(pkey_column) or similar, which strikes me as absurd and non-scalable.

      --
      http://cltracker.net -- powerful craigslist multi-city search
    2. Re:other then features... by Kjella · · Score: 2, Informative

      They have tried. But the databases evolved so much faster than the language specification, especially when it comes to anything past plain SQL like triggers. Hell, even such a thing as automatic numbering is done differently in almost every database. Some things they just don't *want* to implement on ideological reasons, like "UPDATE OR INSERT" or "CREATE IF NOT EXISTS" in PostgreSQL at least. PostgreSQL is definitely on the better side of that though, Oracle is pretty much last so I don't know what to tell you, it'll never happen. It's more likely PostgreSQL will grow into an Oracle than that Oracle will ever support the standards as well as PostgreSQL does. By the way, one thing I've noticed with them is that they're very clear on pointing out what they don't support of the standard or if they do anything extra compared to the standard, that's *very* nice even if it's likely unportable anyway...

      --
      Live today, because you never know what tomorrow brings
    3. Re:other then features... by schmiddy · · Score: 3, Informative

      Some things they just don't *want* to implement on ideological reasons, like "UPDATE OR INSERT" or "CREATE IF NOT EXISTS" in PostgreSQL at least.

      Definitely not true. There's a lot of support to implement the MERGE command from the SQL standard. It's been proposed a few times, but it's more difficult than it sounds to implement. From here:

      And work on MERGE support is itself blocked behind the fact that PostgreSQL doesn't have a good way to lock access to a key value that doesn't exist yet--what other databases call key range locking. See the notes for "Add SQL-standard MERGE/REPLACE/UPSERT command" at http://wiki.postgresql.org/wiki/Todo for more information.

      Your gripe about CREATE ... IF NOT EXISTS might hold water, depending on what exactly you're complaining about (CREATE TABLE? Or for indexes/constraints?). There does seem to be some resistance to CINE, though from what I can tell it's mostly because people would rather have CREATE OR REPLACE, but COR is much harder to implement (what do you do when the object already exists, but is slightly different than the one you're trying to create)?

      --
      http://cltracker.net -- powerful craigslist multi-city search
    4. Re:other then features... by DragonWriter · · Score: 2, Informative

      For example, Celko discourages the use of auto-increment columns (serial type in Postgres, auto-increment primary key in MySQL), in favor of manually incrementing your sequence using MAX(pkey_column) or similar, which strikes me as absurd and non-scalable.

      Actually, while ISTR that Celko notes that using non-standard autoincrementing columns is non-portable and that you have to use other ways of doing so if you want code that is usable across different SQL backends, I'm pretty sure that he actually spends quite some time railing against the entire idea of using autoincremented counters, particularly in their common use as automagical keys which are then exposed to the end-user, and advocates, instead, constructing safe exposed identifiers with features like check digits in such use cases.

      I don't think there is any case in which Celko argues that an autoincrementing counter column is the right solution to a problem; the non-portability of the common features for these is, for him, one more stroke against using them, but not the main problem.

    5. Re:other then features... by XanC · · Score: 2, Informative

      I've never heard of one that doesn't make that guarantee. MySQL's certainly does.

    6. Re:other then features... by MobyDisk · · Score: 4, Informative

      Postgres actually does this... almost.

      First, unlike other SQL engines Postgres is language-independent. There is a plug-in system, and it already ships with a few different SQL variants.

      Second, the primary language is PL/PGSQL which is a clone of Oracle's PL/SQL. As a whole, Postgres started as an open-source Oracle clone. If you do a Google search, you will find several success stories of OraclePostgreSQL migrations because the stored procedure language is so similar.

      However, you are correct: there needs to be a standard. I see that someone posted and said "SQL is the standard" but that isn't good enough. Raw SQL doesn't provide control structures. There's no loops, no if-then-else, etc. As a whole, I like to avoid those, but they are inevitable.

    7. Re:other then features... by MobyDisk · · Score: 2, Informative

      add-on procedural languages like Oracle's PL/SQL aren't going to be supported as-is anytime soon on PostgreSQL

      Actually, PostgreSQL ships with PL/PGSQL which is pretty-much a clone of PL/SQL.

    8. Re:other then features... by butlerm · · Score: 3, Informative

      Actually, MySQL doesn't. At least not with InnoDB:

      "InnoDB uses the following algorithm to initialize the auto-increment counter for a table t that contains an AUTO_INCREMENT column named ai_col: After a server startup, for the first insert into a table t, InnoDB executes the equivalent of this statement:

      SELECT MAX(ai_col) FROM t FOR UPDATE;

      InnoDB increments by one the value retrieved by the statement and assigns it to the column and to the auto-increment counter for the table. If the table is empty, InnoDB uses the value 1"

    9. Re:other then features... by Anonymous Coward · · Score: 4, Funny

      Microsoft Access

      Get out.

    10. Re:other then features... by mvdwege · · Score: 2, Informative

      A sidenote here: the SERIAL datatype in PostgreSQL does not exist. It is merely a shorthand form to create an INTEGER column, a sequence, and assign the nextval() of that sequence as the default value to the column.

      Mart

      --
      "I know I will be modded down for this": where's the option '-1, Asking for it'?
    11. Re:other then features... by borgboy · · Score: 2, Interesting

      Above and beyond SQL-92/SQL-99, PostgreSQL does a good job of implementing the non-optional parts of SQL:2003 and SQL:2008 as well, and in that regard are competitive with or better than the commercial alternatives.

      PL/SQL is probably unlikely to ever be available in the Open Source PostgreSQL product, but it is a feature of EnterpriseDB, which is a PostgreSQL superset.

      INSERT..ON DUPLICATE KEY UPDATE is IIRC similar to the new SQL:2003 MERGE statement, which is on the TODO list for PostgreSQL.

      CLUSTER is a subset of Oracle's index-organized tables / SQL Server's clustered index features.

      One of the truly innovative features that is arriving is exclusion constraints. If you've ever had to implement a scheduling system that deals with concurrent updates, you'll recognize that PostgreSQL has an absolutely killer feature that makes it trivial to solve the concurrent range-excluded searched update problem without messy application code. This feature is pure gold.

      Designing a performance-intensive application that is portable across multiple databases is a frustrating, difficult task. Starting with ANSI/ISO syntax is indeed a great way to base your design, but the devil is truly in the details. ORMs can hurt as much as they help.

      --
      meh.
  7. Color me skeptical. by Estanislao+Mart�nez · · Score: 4, Interesting

    This bites me occasionally in Oracle where you've got a big query that has lots of tables joined together, and then at some point one of the columns is removed from the select part, and the query performance suddenly goes to hell. Then you have to go through and verify that each table is actually being used (even worse if the column that was removed from the select came from deep joins).

    Do you have an actual example? I simply don't see how removing a column from a select clause would make a query slower, unless you're talking about uses of aggregate functions, and even there I'm having a hard time seeing how a removal could make things worse.

    The thing that determines how much work the database has to do in order to produce the results is the FROM, the WHERE and the GROUP BY, because those are the ones that determine what's going to be accessed, joined, sorted and how. The SELECT (except for the use of aggregate functions) primarily just decides what information to present from the join results and how to present it.

    1. Re:Color me skeptical. by Anonymous Coward · · Score: 3, Informative

      you are basically correct, but yet still wrong. Certain optimizations use page reads to improve performance when returning a result set. It will try to access pinned/blocked columns before accessing trivial columns there by reducing collection times. Although these columns may not be indexed per se, they are ordered so retrieval pretty much sequential. Removing those pinned columns (either from the query or from thjs DB in general) may cause the DB to to start 'thrashing' and reduce the performance. This is another reason why vectored indexes are a good thing.

  8. Related Tangent by geoffrobinson · · Score: 3, Informative

    There's a new open-source database from one of the founders of PostgreSQL (Michael Stonebreaker): http://www.voltdb.com/

    I believe it is based on the H-Store project from MIT, and if it is anything like Stonebreaker's Vertica, should be similar in language and syntax to PostgreSQL.

    VoltDB should be for high-demand OLTP. It keeps everything in memory and is MPP (not to mention full-ACID compliance). It runs POSIX compliant unixes, even Mac OS 10.5 and later, Linux, etc. They only support CentOS (which is RHEL if memory serves).

    Anyway, if anyone is interested in PostgreSQL, I would take a look at this.

    --
    Except for ending slavery, the Nazis, communism, & securing American independence, war has never solved anything.
    1. Re:Related Tangent by bill_mcgonigle · · Score: 3, Informative

      Anyway, if anyone is interested in PostgreSQL, I would take a look at this.

      They don't really have similar use cases, but if you need a tight in-memory ACID database Volt might be just the thing. I think if you've ever been tempted to run sqlite on a ram disk, Volt is your baby. If you need high performance ACID and can afford lots of RAM, Volt probably makes you really happy.

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    2. Re:Related Tangent by Ash-Fox · · Score: 2, Interesting

      It runs POSIX compliant unixes, even Mac OS 10.5 and later, Linux, etc.

      Sorry to say, but OS X is no where near POSIX compliance.

      To give one example, it can't even forking properly, it disallows a process to fork violates the POSIX standard. It does not forbid you to fork(), it demands you to fork() and exec() when it cannot guarantee you that the libraries you are using are safe from async-signal-safe. Guaranteeing your code can be forked even in a signal handler at any time is what POSIX demands. Trying to fork() without exec(), this is allowed by POSIX standards, however, if OS X cannot guarantee that the libraries in use are 'async-signal-safe' and this is not allowed, so it crashes the thread. There is tonnes of broken crap like this in OS X. Don't even get me started on the broken POSIX threading.

      Now, the funny thing is that Windows actually possess a POSIX, which is fully compliant while OS X's is not. Why isn't VoltDB available for it? I find this sort of thing stupid.

      --
      Change is certain; progress is not obligatory.
  9. Re:In place upgrades by Trifthen · · Score: 4, Informative

    Actually, there's a utility that works on 8.3 and above: pg_migrator, and isn't really that new. I wrote a long article on it a while ago that covers how we used it, and most of those instructions are not especially specific to our use case. Of course, before 8.3 you'll have to rely on a parallel restore (8.4's pg_restore client has a -j flag much like make, that will load several tables simultaneously, which drastically cuts migration time except for the initial dump.)

    All in all, it's a much better DB than it was in the 7.x days, and that's after the drastic improvements in the 8.x tree. I can't wait for 9.0.

    --
    Read: Rabbit Rue - Free serial nove
  10. I'd say something, but someone will freak out by Rallias+Ubernerd · · Score: 2, Interesting

    I don't understand. What is the advantage of PostgreSQL verses MySQL or a seperate HTTP server?

    Forget it. I know someone is going to freak out and mod me troll. I do not intend to cause harm.

    1. Re:I'd say something, but someone will freak out by coolgeek · · Score: 4, Informative

      It's a real database with ACID compliance designed in from the start, not as an afterthought.

      --

      cat /dev/null >sig
    2. Re:I'd say something, but someone will freak out by Yvan256 · · Score: 2, Funny

      If you want to freak out people, you say something like this:

      I don't use any database, I use plain XML text files.

    3. Re:I'd say something, but someone will freak out by hardburn · · Score: 3, Funny

      CSV? I have a team of slaves move stones around a field like a giant abacus.

      --
      Not a typewriter
    4. Re:I'd say something, but someone will freak out by Slashdot+Parent · · Score: 2, Funny

      What does an HTTP server have to do with a DB?

      PostgreSQL has a lot of more enterprisy features than MySQL.

      You mean like clustering?

      Oh, wait...

      --
      They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
    5. Re:I'd say something, but someone will freak out by Anonymous Coward · · Score: 5, Funny

      And I've been using Google Maps' satellite view to steal all your data. pwned!

  11. Re:In place upgrades by tcopeland · · Score: 4, Informative
  12. Still waiting by coolgeek · · Score: 2

    For the multi-master replication that was promised for 8.4.

    --

    cat /dev/null >sig
    1. Re:Still waiting by schmiddy · · Score: 2, Informative

      Eh? When was multi-master replication ever promised in core? You're probably thinking about hot standby -- the Streaming Replication/Hot Standby code which is the "killer feature" of 9.0 was originally slated for 8.4, but didn't make it in time. I'm really surprised there aren't more comments about SR/HS, as it's an awesome feature which lets Postgres compete with the big boys like Oracle.

      Imagine having your expensive database server be dedicated *only* to writes, and having all your read-only queries spread across one or more slave(s) which are also your backup servers. Pretty cool, huh?

      --
      http://cltracker.net -- powerful craigslist multi-city search
    2. Re:Still waiting by cxreg · · Score: 2, Interesting

      and now there's Postgres-XC, which looks very promising

    3. Re:Still waiting by schmiddy · · Score: 2, Informative

      And those queries return incorrect results when they are behind the master. Pretty cool, huh?

      Yup, that's the idea behind "asynchronous replication", and that's exactly how it's billed. If that's a deal-killer for you (for a large number of read-only queries it won't be.. think web applications where an eventually consistent state is perfectly fine) just wait until Postgres 9.1, when we should have more knobs for controlling master-slave replication. Then you should have the option to force the master to wait until a slave has received its WAL file update, and even fsync'ed it to disk, before your COMMIT returns, giving you the synchronous replication behavior you want.

      The tradeoff with synchronous replication is increased latency on the master of course, since you're hanging around waiting for the slave to fsync (or just receive the data, depending on how you turn the knob). There have even been some proposals floating around for a "quorum commit" model whereby you would tell the master that you want your COMMIT; to go through when n out of m slaves have received the data.

      --
      http://cltracker.net -- powerful craigslist multi-city search
  13. Re:when they have a tool like phpmyadmin by h4rr4r · · Score: 2, Informative

    Phppgadmin is pretty much that, there is also pgadmin which is a desktop app.

    I suggest you educate yourself before making such statements.

  14. Re:I like.... by 0racle · · Score: 2, Informative

    It got pushed off until later when the decision to integrate streaming replication in this release was made.

    --
    "I use a Mac because I'm just better than you are."
  15. No more slony? by XCondE · · Score: 2

    We have been using slony 1.x for master/slave replication with pg 8.3 and it has worked well for us. It does have its problems, specially lock issues when modifying schema for busy tables. I see that 9.0 includes built-in replication and a work-around for these situations (i.e.: kill read-only queries that are in the way).

    I wonder why this didn't make to their list of favourites. It is on the runner-ups though.