Slashdot Mirror


PostgreSQL 9.0 Released

poet writes "Today the PostgreSQL Global Development Group released PostgreSQL 9.0. This release marks a major milestone in the PostgreSQL ecosystem, with added features such as streaming replication (including DDL), Hot Standby and other nifty items like DO. The release notes list all the new features, and the guide explains them in greater detail. You can download a copy now."

32 of 344 comments (clear)

  1. Thank you! by bjourne · · Score: 5, Insightful

    Congratulations to all the Postgres developers and a big thank you from me for an amazing job! Postgres is a wonderful RDBMS and one of the best free software projects there is. Rock on!

    1. Re:Thank you! by Anonymous Coward · · Score: 5, Informative

      Thirded. There is absolutely no reason for anyone to be using MySQL any more other than the old silly excuse "my hosting provider doesn't have anything else". PostgreSQL is now faster than MySQL in all but the most trivial of contrived cases, doesn't require you to choose between table types for different load types, is just as easy to use and install, has all the features that MySQL has and runs on a Windows server (for those idiots who think that is a good thing). Also, the PG community is vastly more helpful and knowledgeable than the rabble that is the MySQL user base.

      Finally, PostgreSQL is a proper independent open source project with a structure that all other open source projects should be judged by. MySQL has gone from hand to hand in the corporate world and has a future that is far from certain.

      Down with the joke that is MySQL, and down with all the idiots that make me work with it.

    2. Re:Thank you! by Chrisq · · Score: 4, Insightful

      Congratulations to all the Postgres developers and a big thank you from me for an amazing job! Postgres is a wonderful RDBMS and one of the best free software projects there is. Rock on!

      Apart from that it now really is just about the only alternative to Oracle or Microsoft.

    3. Re:Thank you! by TheRaven64 · · Score: 3, Funny

      Also, the PG community is vastly more helpful and knowledgeable than the rabble that is the MySQL user base

      You realise, I hope, that encouraging people to switch from MySQL to PostgreSQL will not improve this...

      --
      I am TheRaven on Soylent News
  2. Cool by iONiUM · · Score: 4, Interesting

    I read the notes, noticed the Column and WHEN triggers. Is this in other SQL databases? If it is, I haven't seen it before. In any case, it's pretty cool that you can setup triggers on a conditional statement. That would really help me out in a lot of scenarios, as I work in the BI space, so alerting is a big deal.

    1. Re:Cool by lanner · · Score: 4, Interesting

      I hate to say it, but good/useful features like that will be abused by stupid DB guys who can't program.

      Once upon a time I worked in the entertainment industry and was working on a big MMO game project.

      Company X could not scale up their game clusters past about 1000 players. Somewhere between 1000 and 2000 players, the game would just start bogging down and in-game events piled up and everything trainwrecked and was unplayable.

      So, it turns out that most of the game logic was built off of complicated SQL stored procedures, triggers, logic, etc. Basically, they were using their hard drive as a processor.

      The problem was with the MS-SQL server disk IO Wait. CPU was okay on all of the systems, but they could just not imagine that the disks in the database server (only one DB server per cluster) could be the source of the problems. Every time there was an item dropped, crafted, or certain other special things happened, there was an atomic commit and that basically required writing to disk on the spot. Get enough of that going and you're whole 20-something CPU cluster sits with idle CPU while the DB server works it's hard drives.

      Company went chapter 11, all staff eventually let go, and later was sold off for nothing.

      I had pointed out this problem to them, but it was late in development and when you tell the people who are responsible for designing the product that they are idiots, well, they behave like idiots and don't really listen. Not that they could have fixed it anyway due to time and intellect restraints.

      Anyway, point of the story is that cool SQL features are cool. But don't use your hard drive as a processor.

    2. Re:Cool by GooberToo · · Score: 4, Informative

      What's the difference between logic in the trigger determining when to issue the payload logic, and the logic outside the trigger...

      No! Its far more than syntactic sugar. Performance, readability, and maintainability are what this brings to the table.

      The difference between PostgreSQL's new column trigger feature and traditional triggers is they are only called when the column is modified rather than when any row is modified. This means, in many cases, the trigger will never be called and therefore, the DB isn't having to run at PL/SQL interpreter speeds during the execution of the trigger, to then determine there is nothing for it to do. Furthermore, a big headache which is extremely common to trigger code are IF/THEN/ELSE or long CASE statements to determine which columns are modified, or to determine if the trigger even cares that the row in question (example, columns which the trigger doesn't care about) has been modified.

      The above combination of traditional triggers means lots of overhead, lots of needless PL/SQL code execution, and hard to read/maintain triggers for non-trivial actions. Whereas with the new feature, you can now have a single trigger relate to specific column, which is only ever executed when the trigger should actually execute. Its a win, win, win for all PostgreSQL users. And best of all, this means you can have smaller triggers when you need to perform different actions based on different column changes.

    3. Re:Cool by caerwyn · · Score: 4, Insightful

      Business logic never belongs in the DB. Even triggers are suspect. They can be horribly inefficient.

      The fact that triggers *can* be inefficient is no reason not to use them when there's a good implementation and competent DBAs to make sure they *aren't*. Also, business logic never belongs in the DB? To the contrary- a lot of business logic is sets of rules to maintain consistency between various things. That sort of logic is *precisely* what belongs in the DB, rather than scattered throughout a variety of applications running on top of it.

      --
      The ringing of the division bell has begun... -PF
    4. Re:Cool by GooberToo · · Score: 4, Interesting

      There is a difference between the engine checking a constraint versus a call into an interpreted language. One is doing less work. The other is doing more work. Which is ideal? Obviously less work is better. And that's before you even get into the PL/SQL code which is essentially doing the same work, but slower. Furthermore, all too often, triggers are called when there is no work to be done but you don't know that until the PL code decides this is the type of row change its interested, otherwise it should have really been a NOP. Whereas with the column trigger, the call to the PL code simply never takes place. So we not only save on the call but all of the wasted time inside of a trigger which ultimately decides its has nothing to do.

      Also, when the trigger should be called, in a row trigger, triggers frequently must evaluate which columns have changed before it can even determine if it cares about this row. Should it then decide it does care about this row, likely you've already passed through a mass of CASE and/or IF/THEN/ELSE codes, which ultimately states yet more CASE and/or IF/THEN/ELSE to determine exactly what it should now be doing now that its decided it does need to process this row. Or, you can call a much smaller section of code which is dramatically simplified because one, its only called when its pre-qualified (saving the creation of much redundant code) and two, since its now pre-qualified, we can immediately get to performing whatever logic the trigger in question should do when the column in question has changed.

      Those are worlds apart in performance, readability, maintainability. Not to mention the added granularity makes possible a reduction in the test matrix, regression tests, and even makes it more difficult (though far from impossible) to create a regression.

    5. Re:Cool by Mike+Da.+Kristopeit · · Score: 3, Insightful
      you're wrong. you can't even make your recommendation with a misspelling... i believe you meant "debug once" as your next sentence was "test once"... but i guess i'll have to wait for your ever powerful regression testing engineers to finish their test case implementations before i can test my assertion.

      you just created a job! perhaps an entire department! great idea! well, for people that need jobs... not for people that need to pay to get something implemented correctly done for a price, or developers that take pride in the the perfection of their work and require no such additional external verification or confirmation.

    6. Re:Cool by h4rm0ny · · Score: 3, Insightful

      I can imagine the headlines now: "Two DBA's shot each other dead yesterday morning after they fell out over the maintainability of column-based conditional triggers. A police officer at the scene remarked: 'If only they had been using MySQL with the default ISAM tables which support no such functionality, then none of this would have happened."

      I love the arguments in C++ and DB stories. They're so much better than watching two people slug it out over a football team in a bar. :D

      --

      Aide-toi, le Ciel t'aidera - Jeanne D'Arc.
    7. Re:Cool by Dalcius · · Score: 3, Insightful

      The problem wasn't that the databse was used in a wrong way.

      While the database platform may have been a problem, the first, biggest problem wasn't the hardware. It was (bolded for effect): business logic does not belong in your DB, excepting a handful of cases.

      I wouldn't normally have replied -- I agree with your point about hardware -- but this deserves to be underscored because so many people fundamentally do not get it.

      I work for a company that was just bought out. Our new parent company develops a well known (in its industry) enterprise desktop application. The entirety of their business logic is written in the DB.

      It's a maintenance nightmare, difficult to integrate with outside systems, and the system does not scale. Scalability is kicking their ass... because they can't.

      Our company spends much less on hardware (and software, cheers for PostgreSQL), and our application is a billion times easier to develop and maintain. We use triggers - but only a handful.

      People develop middle layers for a reason. Better code organization and flow control, much, much better speed, scalability, and flexibility.

      Use a DB for what it's good at (ACID, data integrity).

      --
      ~Dalcius
      Rome wasn't burnt in a day.
    8. Re:Cool by Dalcius · · Score: 3, Insightful

      A sincere thank you for the reply.

      Respectfully, I believe you're going down a bad road. For a small application, and a developer who isn't asleep at the wheel, what you're suggesting can work OK; I imagine it hasn't been a problem for you.

      Generally, for everything else, it tends to blow up down the road. And even for smaller projects, the benefits of keeping your logic out of the DB (faster, cleaner, clearer, easier to learn, easier to update) are significant.

      I sincerely want to make an effort to be useful to folks who may not have worked with larger projects before. Please take this post in that spirit. It's a long, and I hope informative, post.

      Make your code's intent clear ... alternatively, write logic in the right context
      The biggest problem is code intent and flow control. Databases model data, GUIs model user actions, but neither necessarily model business logic. You've got business logic code for that -- like your permissions handling.

      Code has a logic-centric view. Databases have a data-centric view. If you write logic in the DB, you're abstracting what you're writing from what you're actually trying to accomplish.

      Error handling is a great example for flow control. Trapping an error in the DB, correctly identifying the problem and bubbling it up in a user-appropriate way in a clean fashion, as you said, is difficult.

      Keep your code together
      Another big problem is finding all the right code. For a newbie or a dev who hasn't looked at a bit of code in 6 months, this is very important.
      Breaking your code up into different buckets based on what data it modifies makes your logic - your flow - hard to follow for anyone who isn't intimately familiar with the project.

      Consider two implementations for a feature:

      1) Application code triggers an update to three tables. A bunch of magic spread across several triggers on those three different tables and written in (depending on your implementation) three different schema files, or one huge file with your entire schema, updates three additional tables. E.G.: "most complex updates are performed using triggers and rules"

      2) One function in one file that, in a transaction, updates six tables.

      Which is easier to follow? From which implementation can you get a clear vision of the intent of the code? Which is easier to skim? If a new developer looks at your code, in which implementation is he more likely to miss something the feature touches?

      Don't make interoperability harder
      If you push logic into your DB, interoperability becomes harder, not easier. Before, you had just data, and two different applications could use it differently and have different requirements -- as long as what they did was consistent with the data model (your schema).

      Now, with logic in the DB, anyone wanting to share your data has to work around your triggers, or co-opt them, or write entirely new triggers, procedures, and generally muck up your application to accomplish their goals.

      Keep your platform flexible
      Pushing logic into the DB, you may have made it easier to switch languages (although you still have a bunch of "front end" code you'd have to convert/maintain), but now it is much harder to switch databases. What was once merely data is now jumbled up with your application. If Oracle, MSSQL, etc. becomes a necessity, you're not just switching databases, you'll likely be refactoring significant chunks of your app.

      Most business needs won't demand you switch languages (worst case, cross-compile or use IPC). Some business needs will demand you switch databases -- my company is now porting our app to MSSQL.

      Your database is likely your bottleneck
      In my experience, the first serious scalability problems most apps encounter are in the database. And the solution is always "more hardware." Beefier machines.

      Then comes horizontal scaling ("moar serve

      --
      ~Dalcius
      Rome wasn't burnt in a day.
  3. Re:"Great leap forward" by catmistake · · Score: 4, Insightful

    LMAO... unless my sarcasm detector is malfunctioning, comparing Postgres to MySQL is extraordinarily absurd... like comparing megaliths to legos.

  4. Re:Has the Documentation Been Improved? by caerwyn · · Score: 4, Informative

    Err, have you actually used the PostgreSQL manual? It's one of the best manuals I've ever seen for a software product.

    --
    The ringing of the division bell has begun... -PF
  5. Re:"Great leap forward" by Anonymous Coward · · Score: 5, Informative

    Yes, but MySQL has a shoddy parser (needs a space after the -- comment tag), poor trigger failing (you have to do a kludge and dump a varchar into an int to get it to fail), apparent lack of direction (how many forks and engines?!), no CTE support and the list goes on. I am constantly banging my head against a wall with MySQL. I use MSSQL for work, Postgres at home and MySQL on hosting.
    I am truly surprised that most web hosting companies do not offer Postgres. Postgres also allows writing of DB functions in C, Java, PHP etc. like Oracle, which is useful for bundling code into the DB (making the DB the application) without everyone having to see your SQL source for functions. It is also licensed on BSD which is good for using their libpq library in commercial apps; MySQL's C API is GPLd or licensed expensively from Oracle, although there are moves toward making it free for use in commercial apps (as far as I can tell from the mishmash of info coming from their sales rep via email).
    Also, as far as I know, MySQL puts all of its indexes in memory for replication which is a problem if the node goes down. Can anyone enlighten me?

    In any case, well done to the Postgres team. Not only is their software package neat, their documentation is some of the best I have ever seen.

  6. Re:Waiting for a capable PostgreSQL front-end by TheFuzzy · · Score: 3, Informative

    Why not just use .NET with PostgreSQL? You can put whatever you want on the back end.

    Or you could use Once:Radix or Servoy, both of which integrate with PostgreSQL.

    https://sourceforge.net/projects/onceradix/
    http://www.servoy.com/

  7. As always... by jd · · Score: 5, Interesting

    The new features are much admired by all (and deservedly so), but a heavier footprint typically means poorer performance overall even if there's accelerated performance in specific areas or improved programming. I'd like to see a performance plot, showing version versus performance versus different types of system load, in order to see how well new stuff is being added in. It might be merged in great and the underlying architecture may be superb, but I would like to see actual data on this.

    Also, PostgreSQL and MySQL aren't the only Open Source SQL databases. Including variants and forks, you really need to also consider Ingres, Drizzle, MariaDB, SAP MaxDB, FireBird and SQLite. If you want to also compare against Closed Source DBs, then you'd obviously want to look at DB/2, Oracle, Cache and Sybase. I'd love to see a full comparison between all of these, feature-for-feature, with no bias for or against any specific development model or database model, but rather an honest appraisal of how each database performs at specific tasks.

    I like PostgreSQL a lot. I rate it extremely highly. However, without an objective analysis, all I have is my subjective perception. And subjective perceptions are not something I could credibly use in a workplace to encourage a switch. For that matter, subjective perceptions are not something I would consider acceptable for even telling a friend what to use. Perceptions are simply not credible and have no value in the real world.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    1. Re:As always... by jwpye · · Score: 3, Interesting

      I think there has been some effort to bring a PostgreSQL "performance farm" online to show the differences in performance across versions of PostgreSQL, and to quickly identify regressions during development. I don't think it's up yet, but a search should reveal some details on the project.

    2. Re:As always... by greg1104 · · Score: 5, Interesting

      You've got the performance part backwards for PostgreSQL; it goes up with every release, sometimes a little, sometimes in a big way. See PostgreSQL history for a comparison covering versions 8.0 to 8.4. The mild regression in 8.4 shown there is actually reversible; it's mainly because a query related parameter for how many statistics to collect and use for query planning was increased by default. That results in better plans for most real-world queries, but it detuned this trivial benchmark a little bit. You can get performance back to 8.3 levels just by turning the parameter back to the "optimized for trivial queries" default of the older versions if you care about that. Most people prefer the new default. In the real world, 8.4 is actually faster due to improved handling of background VACUUM tasks too, which don't show up in simple benchmarks either.

      I'm the current lead architect on building a PostgreSQL Performance Farm to prevent regressions from popping into future versions of the code too. There is a recently completed beta client for that purpose. We're in the process of working out how to integrate into future development, starting with 9.1, so that potential regressions are spotted on a commit by commit basis. I haven't seen any performance regressions between 8.4 and 9.0, only moderate improvements overall and large ones in specific areas that were accelerated.

      Now, if you use some of the new replication features aggressively, that can add some overhead to slow down the master. But that's true of most solution; the data coming off the master has to take up some time to generate. The way PostgreSQL 9.0 does it is is pretty low overhead, it just ships the changed blocks around. Theoretically some statement based solutions might have lower overhead, but they usually come with concerns about non-determinism on the slaves when replayed (random numbers, timestamps, and sequence numbers are common examples).

      Given the non-disclosure terms of most of the closed source databases, nobody can publish benchmarks that include them without going through something like the TPC or SPEC process. The last time that was done in 2007, PostgreSQL 8.2 was about 15% slower than Oracle running the same database-heavy workload. And note that it was PostgreSQL 8.3 that had one of the larger performance increases, so that was from just before a large leap forward in PostgreSQL performance.

      At this point, Oracle and most other commercial databases still have a large lead on some of the queries run in the heavier TPC-H benchmarks. Links to more details as to why are on the PostgreSQL wiki. It just hasn't been a priority for development to accelerate all of the types of queries required to do well in that benchmark, and nobody so far has been willing to fund that or the subsequent certification via the TPC yet. Sun was the only one throwing money in that direction, and obviously the parts of that left within Oracle will no longer do so.

  8. Re:Has the Documentation Been Improved? by jwpye · · Score: 5, Informative

    hrm? The documentation is regularly updated... http://www.postgresql.org/docs/9.0/static/

  9. Re:Firebird is better by h4rr4r · · Score: 3, Insightful

    Please tell me you're kidding. Anyone suggesting MySql for real work should just be laughed at.

  10. Re:Waiting for a capable PostgreSQL front-end by MichaelSmith · · Score: 3

    then has to be replaced with real code and a real DB.

    Oddly enough I spoke to somebody last weekend who makes her living doing exactly that. I suspect that without access to kick these projects off she would have less work overall.

    access -> sqlite -> mysql -> postgres -> oracle

    Everybody looks down on the tools to the left.

  11. Re:Has the Documentation Been Improved? by shutdown+-p+now · · Score: 4, Informative

    It's actually one of the best manuals for SQL in general - at least, in my experience, it has the most clear explanations of many more advanced SQL constructs that are common between various RDBMSes.

  12. I guess we really are the leader now by TheFuzzy · · Score: 4, Funny

    PostgreSQL *must* be the leading open source SQL database, now. People are bashing us on Slashdot. That's always a sign of success.

    Thanks, guys!

    --Josh Berkus
        PostgreSQL contributor

  13. Re:Waiting for a capable PostgreSQL front-end by guusbosman · · Score: 5, Informative

    You know that you can point your MS Access client to any supported back-end right? Just create an ODBC connection on your Windows machine to your PostgreSQL server and you can use Access with pretty much all the features that work for the Microsoft JetEngine (PostgreSQL has ODBC drivers here; http://www.postgresql.org/ftp/odbc/versions/)

    Earlier this year we converted a huge Access application from MSSQL to PostgreSQL and the technical conversion, using ODBC to PostgreSQL instead of connecting to MSSQL, was a piece of cake.

  14. Re:Meh by Daniel_Staal · · Score: 4, Informative

    Um, yeah. MySQL, out of the box, using the defaults, doesn't support foreign keys now. You have to specifically create the tables with a non-standard SQL code to get them to use the right database backend to get foreign key support.

    Unless you mean by 'support' 'Will silently accept and throw away'...

    Foreign keys have been enabled and working by default in Postgres since version 7. (There was no version 5...) That was released just over ten years ago at this point.

    --
    'Sensible' is a curse word.
  15. Re:"Great leap forward" by Alex+Zepeda · · Score: 3, Informative

    | Please tell me you're kidding. Anyone suggesting MySql for real work should just be laughed at.

    I'm not sure how you got modded to +5 with this statement, but your statement is uninformed and completely false. While MySQL isn't in the class of Oracle for HA, MySQL with InnoDB is damn well is a competent database and I don't just mean for LAMP.

    Eh. It's *okay* for lightweight work where you don't care about data integrity or don't add or modify a lot of data. Beyond that it falls apart quickly.

    At a previous job we used ActiveRecord hooked up to MySQL to handle an influx of temporal data that was meant to be quickly processed and usable for reading back in real time. ActiveRecord uses sequences (so, auto increment fields in MySQL -- since proper sequences are lacking in MySQL) for the primary key. With Postgres this is not a problem at all. InnoDB, OTOH, locks *the entire table* to update an auto increment field. The sysadmin/dba was averse to using Postres, so the result was a series of complex and tedious to debug performance problems and queues. We spent countless hours dancing around the performance problems inherent to table level locking.

    Of course we could have gone with MyISAM... but data integrity was important. There were other seemingly basic features that were lacking in MySQL (timezone support and a useful explain command come to mind). As far as I can tell there aren't a lot of good reasons to actively choose MySQL. The lightweight cases are well handled by SQLite, and the heavier stuff will almost certainly benefit from what Postgres has to bring to the table.

    --
    The revolution will be mocked
  16. Re:I think Sun was giving them access to some... by greg1104 · · Score: 3, Informative

    The servers Sun supplied that Oracle recently yanked were for the regular PostgreSQL build farm, used to run basic regression tests. They've since been replaced, the project is unmoved. As I mention in more detail in my upthread post, work on the PostgreSQL performance farm continues unaffected by that. It is expected that some build farm machines will also run the performance farm client periodically too, that's the only overlap there ever was between the two pieces of work. If Oracle still had hardware in the build farm it could have been used for performance tests too eventually. But they don't, and we in the PostgreSQL community don't care; we don't need their contributions.

  17. Re:"Great leap forward" by caerwyn · · Score: 4, Insightful

    Part of the reason MySQL gets treated as a toy is its release discipline- or lack thereof. At least one of the 5.x releases came out with *known* data-loss bugs; that's just not even remotely acceptable in a database, and that's the sort of impression that's hard to shake: people aren't just going to look at subsequent releases and go "oh, well, they say they're paying more attention this time, I guess that's good enough".

    --
    The ringing of the division bell has begun... -PF
  18. Re:Firebird is better by GooberToo · · Score: 3, Interesting

    Its extremely ironic that you changed just as PostgreSQL become considerably faster than MySQL. PostgreSQL has always been far more scalable. To now hear you brag that you've never looked back at a superior and faster database because of your steadfast and likely false belief that MySQL is faster, is rather amusing.

    One of the biggest problems with the MySQL user base is that they don't have any idea what "faster" means nor do they typically understand how to benchmark. Made worse, they constantly confuse speed with scalability. And made ever worse, most MySQL users take the MySQL benchmarks to heart when time and time again they are nothing but marketing lies. Most independent tests have historically had lots of problems even getting MySQL to stay running until the end. And when it actually does finish, its normally somewhere between the middle of the pack to dead last - and that's with all the other databases forced to use the lowest common feature which prevents them from using their advanced, much, much faster features.

    The bottom line is, MySQL is popular because it has buzzword compliance for people who almost always don't know any better; but most of all, was readily available on Windows at a time when everyone was looking for a free database to go to. PostgreSQL is popular because it has both buzzword compliance, is far more feature rich, almost always out performs MySQL, and underscores, not to mention truly understands, what ACID is all about - while providing a very rich set of features which MySQL is unlikely to ever match. And that's ignoring that MySQL's optimizer absolutely stinks for anything but the most simple of queries.

    The best rule of thumb is, think of MySQL as a really fast Access database. If you wouldn't use Access, ignoring database performance in the comparison, you should think really hard about using MySQL. There are so many superior and still free RDBMs compared to MySQL, its easy to see why so many get so frustrated when others insist on injecting an dramatically inferior solution into the equation, just because it has buzzwords.

  19. Re:"Great leap forward" by georgeb · · Score: 3, Interesting

    I will second that. I remember a discussion with a mysql dev where I was trying to raise the point that the db should not accept 30 feb as a valid date. A quite senior dev, backed up by numerous voices on the mailing list, was trying to convince me that it wasn't the db's job to check for that, quoting performance concerns. This was at least 10 years ago. But nonetheless, it's not a good start for a DB to have core developers like that. I don't like MySQL primarily because it cares about standard SQL just about as much as Microsoft does. I find the documentation to be abhorring. DDL is cumbersome. Working in commandline mysql is pure torture compared to psql (on linux! psql for windows is rubbish; not surprising considering the mess that windows commandline terminal is).