Slashdot Mirror


PostgreSQL 9.5 Does UPSERT Right (thenewstack.io)

joabj writes: For years, PostgreSQL users would ask when their favorite open source database system would get the UPSERT operator, which can either insert an entry or update it if a previous version already existed. Other RDMS have long offered this feature. Bruce Momjian, one of the chief contributors to PostgreSQL, admits to being embarrassed that it wasn't supported. Well, PostgreSQL 9.5, now generally available, finally offers a version of UPSERT and users may be glad the dev team took their time with it. Implementations of UPSERT on other database systems were "handled very badly," sometimes leading to unexpected error messages Momjian said. Turns out it is very difficult to implement on multi-user systems. "What is nice about our implementation is that it never generates an unexpected error. You can have multiple people doing this, and there is very little performance impact," Momjian said. Because it can work on multiple tables at once, it can even be used to merge one table into another.

20 of 105 comments (clear)

  1. Re: Too late by Anonymous Coward · · Score: 5, Funny

    Oh. How disappointing. I'll let the team know to remove the code that does this then seeing as you already have a solution.

  2. Re: Fucking finally. by Anonymous Coward · · Score: 5, Insightful

    PostgeSQL is one of those rare projects that delays a bit but makes sure that they plan their codebase ahead, and Get It Right The First Time.

    Nobody should ever whine about delays in projects that value correctness over first to market, especially in an open source project.

  3. Re:Cool by phantomfive · · Score: 2

    but that doesn't mean I want his bundt cake recipe

    Does he have a good one?

    --
    "First they came for the slanderers and i said nothing."
  4. Thanks a lot PostgreSQL devels by hrumph · · Score: 5, Interesting

    This is great. I've been using PosgreSQL for a while now. It's one of those pieces of software that just does what it's told and doesn't let you down. While I'm saying this there are credible rumours to the effect that the Oracle merge operation is broken. Read the comments to the most upvoted answer at this stack exchange question. The final comment is:
    Not reliably. I ended up with retry loops in the client code. :( – Randy Magruder Aug 27 '15 at 16:05
    This makes me think that Bruce Momijan may have been thinking about Oracle's implementation of merge when he said that other implementations were handled very badly.

    1. Re: Thanks a lot PostgreSQL devels by BigZee · · Score: 2

      Merge is not really the same as upsert. Merge (using the dual psuedo table) is a potential solution to the problem as are a number of options using pl/sql. I think the real truth here is that oracle haven't seen the need to implement an upsert command. Its not difficult to code and would be easy to build a solution without the need of a special command.

    2. Re: Thanks a lot PostgreSQL devels by Anonymous Coward · · Score: 4, Informative

      It is not that trivial, if you have to take into account race conditions and roll-back without using a global lock which would kill the performance. There are actually quite a few research papers just about this problem.

  5. Re:I'm curious by Anonymous Coward · · Score: 2, Informative

    Something like this :

    IF EXISTS (SELECT id FROM [table] WHERE id = :id)
    THEN
          UPDATE TABLE [table] .......... WHERE id = :id
    ELSE
          INSERT INTO [table] (.......) VALUES (...........)
    END IF;

  6. Re: Fucking finally. by Rei · · Score: 4, Informative

    Postgres is leading edge in some things, like (ridiculously useful) table inheritance. Their implementation could have been slightly better (the most common complaint being that indices don't inherit, you need to re-add them on all descendants), but I'm very glad that they were early adopters on this one. They even pulled off multiple inheritance well.

    --
    He's the sort of person who would sell the Red Cross to Dracula.
  7. Re:Too late by TheRaven64 · · Score: 2

    I hope that you're doing the check then insert inside a transaction.

    --
    I am TheRaven on Soylent News
  8. Re:I'm curious by hrumph · · Score: 2

    Your answer is naive and wrong. See this Stack Overflow page. The solutions on this page work but none of them have the elegance of the newly available INSERT ... ON DUPLICATE UPDATE method.

  9. Re:I'm somewhat by Lord+Crc · · Score: 4, Informative

    Is this summary saying that, prior to now, you could not update data that was in the database?

    No. In some cases you want to use a table like a key-value storage. If the key does not exist you need to insert a row with that key. Otherwise you'll want to keep only one row with that key, and just update the value.

    One can always do a "if key exists then update else insert", but the problem is that this is not atomic, because the "key exists" check is a separate statement from the insert or update statement. This can lead to issues if you have multiple connections accessing the same keys at the same time.

    The UPSERT allows you to do this as one atomic operation.

  10. Re:I'm somewhat by vikingpower · · Score: 3, Informative

    Lord Crc answered correctly. UPSERT = (INSERT iff not exists... else UPDATE ). The article is correct in the sense that other databases, especially open source one, do not always handle this one correctly. It was not part of "traditional" SQL and is rather new. Object-oriented and graph databases do not have any problems with such operators, as they're explicitly written to deal with this use case. For relational databases like PostgreSQL this is a harder one to get right. Whether PostgreSQL now really got it right, can only be proven by protracted use "out in the wild".

    --
    Religous speak to God. Insane are spoken to by God. When all shut up, one can finally hear Shostakovich in peace
  11. Re:I'm curious by halivar · · Score: 2

    The GGP asked for "traditional" SQL, for which the GP offered the correct answer. You offered another vendor-specific solution, not standard ANSI. The GP was in no way wrong. If he was doing retry loops, sure; that would be wrong.

  12. Oracle was there first by Ora*DBA · · Score: 2

    Oracle has included the 'merge' command for several iterations now, which does the same thing. It is a rich command - those interested can check the documentation at https://docs.oracle.com/database/121/SQLRF/statements_9016.htm#SQLRF01606

  13. Re:I'm somewhat by Lord+Crc · · Score: 4, Informative

    Most databases allow you to do transactions (BEGIN TRAN/COMMIT TRAN) that force it to be atomic.

    Sure, but in that case you might get errors when you try to commit if another connection has changed the value in the meantime, and you'll have to retry the whole thing. That's what UPSERT avoids.

  14. Re:Too late by MagicMerlin · · Score: 3, Informative

    That's how to do it. It has to be managed in a loop transaction side or in the application. This loop is remarkably difficult to get right by most programmers. For example, if a trigger is added to the inserted table after the fact causing an error in some other table, most loop based upsert code I've seen will fail into an infinite loop.

  15. Re:I'm curious by mrchaotica · · Score: 2

    In terms of "traditional" SQL, the GGP's answer is wrong because it fails to be atomic. The whole thing needs to be wrapped in a transaction.

    --

    "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

  16. Re:I'm curious by halivar · · Score: 3, Informative

    MERGE can combine multiple INSERT, UPDATE and DELETE operations into a single atomic operation.

    In PostgresSQL, Oracle, and SQL Server, MERGE is not atomic. Also, it is not UPSERT. You can use MERGE to accomplish the same end goal, but they are not synonymous and they do not work the same. Also, the syntax is idiosyncratic, and most extant implementations are problematic enough that MERGE is best avoided.

  17. Re:I'm curious by Zero__Kelvin · · Score: 2

    "What would be the equivalent long-hand in "traditional" SQL?"

    It doesn't exist. That's the point. This isn't implemented to save typing. It is true atomic insert or update.

    --
    Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
  18. Re:Progress! by Qzukk · · Score: 2

    What's broken with the transaction isolation model? The only thing I see is that they don't do the non-transaction "read uncommitted" transaction that lets you see records that other transactions have not committed.

    --
    If I have been able to see further than others, it is because I bought a pair of binoculars.