Monday, October 17, 2016

Unreliable affected-rows with conditional upserts in MySQL

This post is about my attempts with affected-rows for MySQL conditional Upserts to be unreliable.

Image result for conditionalUPSERTs are tempting, with perks like single round trip to DB,  atomic properties, and simpler SQL client code.
Then, if your update depends on the existing row data, there's even Conditional UPSERT.
A conditional UPSERT can have one of three outcomes:
  1. Row is inserted as a new row
  2. An existing row/column is updated (condition evaluated True).
  3. Existing row/column wasn't updated (condition evaluated False).
The SQL client can tell the UPSERT outcome with the Affected-rows value.

For non-conditional UPSERTs the affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values.

For conditional UPSERTs I saw non consistent results. affected rows was 2 OR 3, when an existing row was updated (condition true), 2 when an existing row wasn't updated. 1 for new row is inserted.
Such inconsistent behavior could easily cause bugs. I didn't find a clear pattern of when it happens. I also didn't find proper documentation of what should is the expected affected-row value. 
Since the ON DUPLICATED KEY UPDATE uses an IF statement I suspect it loses track of the results of the IF statement. Also each column's condition can evaluated differently than others columns conditions.


without knowing what to expect of the # of affected rows, and UPSERTs being easily breakable in nature when switching columns update order). I decided to ditch them completely, and ended up implementing the conditional logic in Java client side, using more than one SQL command wrapped in a transaction.

 My conditional UPSERT (expect inconsistent affected rows value):

INSERT INTO account_last_touch
            (service_id, account_name, user_name, touch_time)
            VALUES('123', '456', 'u1', '2016-07-19 12:11:15')
 =IF('2016-07-23 12:11:15'>touch_time, 'u1', user_name), 

touch_time = IF('2016-07-24 12:11:15'>touch_time, '2016-07-24 12:11:15', touch_time)