Mixed Feelings About MERGE

I have a love-hate relationship with the MERGE SQL statement.
Well, maybe it’s a bit harsh. Let’s make it a like-dislike relationship.

Why I Like It

Because it provides an elegant way to perform an “insert if not exists, update if exists” logic in a single statement. It has a rich syntax with various options for different use cases. One of the very first posts I wrote in this blog, in 2013, was a primer of the MERGE statement.

Why I Dislike It

I don’t like how it deals with concurrent transactions that try to handle the same key.
Let me show you what I mean exactly.

Demo Setup

I have a table t, with a primary key on the id column.

23ai> create table t (
  2    id integer not null
  3      constraint t_pk primary key,
  4    data varchar2(100),
  5    last_operation varchar2(10)
  6  );

Table created.

23ai>
23ai> insert into t (id,data)
  2    select rownum,'init'
  3    from dual
  4    connect by level<=5;

5 rows created.

23ai> commit;

Commit complete.

I want a procedure that gets two input parameters – p_id and p_data – and either inserts a new record into t (if there is no record with that id already in the table) or updates an existing record (if there is a record with that id already).
I will write two procedures – one that uses the MERGE statement and one that uses the INSERT and UPDATE statements.

23ai> create procedure with_merge (
  2    p_id in t.id%type,
  3    p_data in t.data%type
  4  ) as
  5  begin
  6    merge into t
  7    using dual
  8    on (t.id = p_id)
  9    when matched then
 10      update set data = p_data,
 11                 last_operation = 'update'
 12    when not matched then
 13      insert (id,data,last_operation)
 14      values (p_id,p_data,'insert');
 15  end with_merge;
 16  /

Procedure created.

23ai>
23ai> create procedure without_merge (
  2    p_id in t.id%type,
  3    p_data in t.data%type
  4  ) as
  5  begin
  6    insert into t (id,data,last_operation)
  7    values (p_id,p_data,'insert');
  8  exception
  9    when dup_val_on_index then
 10      update t
 11      set t.data = p_data,
 12          t.last_operation = 'update'
 13      where t.id = p_id;
 14  end without_merge;
 15  /

Procedure created.

In the without_merge procedure we try to insert a record into t. If we get the dup_val_on_index exception (which means that t_pk – the only unique index that we have – already contains an entry for the p_id value), then we update the existing record.

Note that this is a simple implementation that does not cover all the edge cases.

Comparison

I would expect both procedures to behave the same way, and most of the time they do:

23ai> begin
  2      with_merge(1,'with_merge');
  3      with_merge(6,'with_merge');
  4      without_merge(2,'without_merge');
  5      without_merge(7,'without_merge');
  6      commit;
  7  end;
  8  /

PL/SQL procedure successfully completed.

23ai> select * from t order by id;

        ID DATA                           LAST_OPERA
---------- ------------------------------ ----------
         1 with_merge                     update
         2 without_merge                  update
         3 init
         4 init
         5 init
         6 with_merge                     insert
         7 without_merge                  insert

7 rows selected.

Everything looks good.

But now let’s compare what happens when two sessions try to add a new record with the same id.
First, let’s use the without_merge procedure.

--
-- In session 1 
--
23ai> exec dbms_application_info.set_client_info('Session 1')

PL/SQL procedure successfully completed.

23ai> exec without_merge(8,'without_merge in session 1')

PL/SQL procedure successfully completed.

A new record was added, but we didn’t commit yet, so it is not visible to other sessions.

Now, from another session we call the procedure with the same id:

--
-- In session 2
-- 
23ai> exec dbms_application_info.set_client_info('Session 2')

PL/SQL procedure successfully completed.

23ai> exec without_merge(8,'without_merge in session 2')

Session 2 is now blocked by session 1, which locks the unique index entry for id 8. Session 2 waits to lock the same entry, as part of the INSERT statement.

23ai> break on client_info skip 1
23ai> col client_info head "Client|Info" format a10
23ai> col type        head "Type"        format a4
23ai> col lmode       head "Mode|held"   format a4
23ai> col rmode       head "Mode|req"    format a4
23ai> col obj         head "Obj or Tx"   format a55
23ai>
23ai> select
  2    s.client_info,
  3    l.type,
  4    decode (l.lmode,  1,'-',2,'RS',3,'RX',4,'S',5,'SRX',6,'X') lmode,
  5    decode (l.request,1,'-',2,'RS',3,'RX',4,'S',5,'SRX',6,'X') rmode,
  6    case
  7      when l.type='TM' then (select o.owner||'.'||o.object_name from dba_objects o where o.object_id=l.id1)
  8      when l.type='TX' then to_char(trunc(l.id1/power(2,16)))||'/'||to_char(bitand(l.id1,to_number('ffff','xxxx')))||'/'||l.id2
  9    end obj
 10  from
 11    v$lock l,
 12    v$session s
 13  where
 14    l.sid = s.sid and
 15    s.username = USER and
 16    l.type in ('TM','TX')
 17  order by
 18    1,
 19    2;

Client          Mode Mode
Info       Type held req  Obj or Tx
---------- ---- ---- ---- -------------------------------------------------------
Session 1  TM   RX        DEMO2.T
           TX   X         3/23/91390

Session 2  TM   RX        DEMO2.T
           TX   X         2/28/49946
           TX        S    3/23/91390

What will happen when session 1 commits?

--
-- In session 1
--
23ai> commit;

Commit complete.

Once the transaction in session 1 is committed, the INSERT statement in session 2 fails due to the unique index, and the exception handler performs the UPDATE statement.

--
-- In session 2
--
PL/SQL procedure successfully completed.

23ai> commit;

Commit complete.

23ai> select * from t order by 1;

        ID DATA                           LAST_OPERA
---------- ------------------------------ ----------
         1 with_merge                     update
         2 without_merge                  update
         3 init
         4 init
         5 init
         6 with_merge                     insert
         7 without_merge                  insert
         8 without_merge in session 2     update

8 rows selected.

This is the expected behavior. The final state is the same as if the two transactions were serialized (session 1 calls the procedure and commits, and only then session 2 calls the procedure).

Now let’s repeat the test with the MERGE version.

--
-- In session 1 
--
23ai> exec with_merge(9,'with_merge in session 1')

PL/SQL procedure successfully completed.

A new record was added, but we didn’t commit yet, so it is not visible to other sessions.

Now, from the other session we call the procedure with the same id:

--
-- In session 2
-- 
23ai> exec with_merge(9,'with_merge in session 2')

Session 2 is now blocked by session 1. Same as in the previous test.

Client          Mode Mode
Info       Type held req  Obj or Tx
---------- ---- ---- ---- -------------------------------------------------------
Session 1  TM   RX        DEMO2.T
           TX   X         3/30/91369

Session 2  TM   RX        DEMO2.T
           TX   X         2/33/49941
           TX        S    3/30/91369

What will happen when session 1 commits the transaction?

--
-- In session 1
--
23ai> commit;

Commit complete.

Once the transaction in session 1 is committed, the INSERT statement in session 2 fails due to the unique index, but unlike in the without_merge case, the MERGE statement fails:

--
-- In session 2
--
BEGIN with_merge(9,'with_merge in session 2'); END;

*
ERROR at line 1:
ORA-00001: unique constraint (DEMO2.T_PK) violated on table DEMO2.T columns (ID)
ORA-03301: (ORA-00001 details) row with column values (ID:9) already exists
ORA-06512: at "DEMO2.WITH_MERGE", line 6
ORA-06512: at line 1
Help: https://docs.oracle.com/error-help/db/ora-00001/

23ai> select * from t order by id;

        ID DATA                           LAST_OPERA
---------- ------------------------------ ----------
         1 with_merge                     update
         2 without_merge                  update
         3 init
         4 init
         5 init
         6 with_merge                     insert
         7 without_merge                  insert
         8 without_merge in session 2     update
         9 with_merge in session 1        insert

9 rows selected.

This is wrong, in my opinion.

This example demonstrates that the MERGE statement first of all checks if the record exists. If it doesn’t find such a record, it decides to perform an INSERT, and it will not change its mind even if the INSERT fails (as at this point it turns out there actually is an existing record).

5 thoughts on “Mixed Feelings About MERGE”

    1. Thanks Connor.
      Yes, that’s what I meant by “it will not change its mind” (i.e., it will not do a restart) 🙂
      My main point is that it is important to be familiar with this behavior, so you know when it is safe to use MERGE (and there are plenty of such cases), and when you need to use alternatives.
      Thanks for the reference to Andrej’s work!
      I’m glad to hear about the “work in progress”. I know this behavior for so many years, that I was skeptic that it would ever change.
      Thanks,
      Oren.

  1. Hi Oren,
    Your saying that “MERGE will not change its mind” is in fact equivalent to saying that,
    in such a case, MERGE does not perform a “restart”, similar to the one that happens in some cases with an UPDATE and, slightly differently, with the UPDATE branch of a MERGE.

    Maybe this is because restarts were never related to an INSERT, but only to UPDATE and DELETE ?
    Since MERGE is in fact a mix of the three DML statements, maybe there is place for such an enhancement, namely to perform a “restart” + “branch change” in cases like your example ?

    This reminded me of some nice blog posts of Andrej Paschenko, where he analyzes the different behavior of restart between an UPDATE and an UPDATE branch of a MERGE.

    https://blog.sqlora.com/en/merge-and-ora-30926-part-2-or-differences-in-write-consistency-between-update-and-merge/

    Cheers & Best Regards,
    Iudith

    1. Iudith, Oracle provides Repeatable Read at statement level by restarting the statement when detecting a row that changes. But it doesn’t provide Serializable because it has no conflict detection on inexisting rows (no range lock), so no restart for an insert.

      Oren, I think the semantic of MERGE is correct. It is read-then-write, without the need for a key violation. The read is an MVCC snapshot like all consistent reads. Your without_merge is different, a special case using a unique key exception. PostgreSQL implements both, with MERGE and INSERT … ON CONFLICT and have the same semantic difference

Leave a Reply

Your email address will not be published. Required fields are marked *