Slashdot Mirror


PostgreSQL 9.5 Does UPSERT Right (thenewstack.io)

joabj writes: For years, PostgreSQL users would ask when their favorite open source database system would get the UPSERT operator, which can either insert an entry or update it if a previous version already existed. Other RDMS have long offered this feature. Bruce Momjian, one of the chief contributors to PostgreSQL, admits to being embarrassed that it wasn't supported. Well, PostgreSQL 9.5, now generally available, finally offers a version of UPSERT and users may be glad the dev team took their time with it. Implementations of UPSERT on other database systems were "handled very badly," sometimes leading to unexpected error messages Momjian said. Turns out it is very difficult to implement on multi-user systems. "What is nice about our implementation is that it never generates an unexpected error. You can have multiple people doing this, and there is very little performance impact," Momjian said. Because it can work on multiple tables at once, it can even be used to merge one table into another.

8 of 105 comments (clear)

  1. Re:I'm curious by Anonymous Coward · · Score: 2, Informative

    Something like this :

    IF EXISTS (SELECT id FROM [table] WHERE id = :id)
    THEN
          UPDATE TABLE [table] .......... WHERE id = :id
    ELSE
          INSERT INTO [table] (.......) VALUES (...........)
    END IF;

  2. Re: Fucking finally. by Rei · · Score: 4, Informative

    Postgres is leading edge in some things, like (ridiculously useful) table inheritance. Their implementation could have been slightly better (the most common complaint being that indices don't inherit, you need to re-add them on all descendants), but I'm very glad that they were early adopters on this one. They even pulled off multiple inheritance well.

    --
    He's the sort of person who would sell the Red Cross to Dracula.
  3. Re:I'm somewhat by Lord+Crc · · Score: 4, Informative

    Is this summary saying that, prior to now, you could not update data that was in the database?

    No. In some cases you want to use a table like a key-value storage. If the key does not exist you need to insert a row with that key. Otherwise you'll want to keep only one row with that key, and just update the value.

    One can always do a "if key exists then update else insert", but the problem is that this is not atomic, because the "key exists" check is a separate statement from the insert or update statement. This can lead to issues if you have multiple connections accessing the same keys at the same time.

    The UPSERT allows you to do this as one atomic operation.

  4. Re:I'm somewhat by vikingpower · · Score: 3, Informative

    Lord Crc answered correctly. UPSERT = (INSERT iff not exists... else UPDATE ). The article is correct in the sense that other databases, especially open source one, do not always handle this one correctly. It was not part of "traditional" SQL and is rather new. Object-oriented and graph databases do not have any problems with such operators, as they're explicitly written to deal with this use case. For relational databases like PostgreSQL this is a harder one to get right. Whether PostgreSQL now really got it right, can only be proven by protracted use "out in the wild".

    --
    Religous speak to God. Insane are spoken to by God. When all shut up, one can finally hear Shostakovich in peace
  5. Re: Thanks a lot PostgreSQL devels by Anonymous Coward · · Score: 4, Informative

    It is not that trivial, if you have to take into account race conditions and roll-back without using a global lock which would kill the performance. There are actually quite a few research papers just about this problem.

  6. Re:I'm somewhat by Lord+Crc · · Score: 4, Informative

    Most databases allow you to do transactions (BEGIN TRAN/COMMIT TRAN) that force it to be atomic.

    Sure, but in that case you might get errors when you try to commit if another connection has changed the value in the meantime, and you'll have to retry the whole thing. That's what UPSERT avoids.

  7. Re:Too late by MagicMerlin · · Score: 3, Informative

    That's how to do it. It has to be managed in a loop transaction side or in the application. This loop is remarkably difficult to get right by most programmers. For example, if a trigger is added to the inserted table after the fact causing an error in some other table, most loop based upsert code I've seen will fail into an infinite loop.

  8. Re:I'm curious by halivar · · Score: 3, Informative

    MERGE can combine multiple INSERT, UPDATE and DELETE operations into a single atomic operation.

    In PostgresSQL, Oracle, and SQL Server, MERGE is not atomic. Also, it is not UPSERT. You can use MERGE to accomplish the same end goal, but they are not synonymous and they do not work the same. Also, the syntax is idiosyncratic, and most extant implementations are problematic enough that MERGE is best avoided.