As my luck would have it, the genius architects back in IT do an awful lot of INSERTS and UPDATES. At least, it looks that way to me. Out of a 63M row DB, I find about 600K INSERTS and 1M UPDATES weekly.
What's a MajorDomo to do? The support for UPSERTS via MySQL and Domo isn't all that good. I tried but just could not make it work.
I'm having better luck with Redshift at the moment. With a dataflow, I basically encapsulate the INSERTS and REPLACEs in a transaction, as follows:
begin transaction;
--- delete all rows from the source table where the primary key (row_wid) matches those of the rows to be INSERTED.
DELETE from w_quote_f
USING w_quote_f_inserts
WHERE w_quote_f.row_wid = w_quote_f_inserts.row_wid;
--- do the inserts
--- first id the fields to be changed. Since the primary key doesn't match (this is an INSERT), primary key is listed in the field listing. This proved to be very helpful in troubleshooting the next issue, field ordering. Turns out that the fields need to be listed in the exact order so that the field type definitions (character, bigint, etc.) all line up correctly.
INSERT INTO w_quote_f
(
list of fields between parentheses
)
--- then get the replacement values
SELECT list of fields in the same order
FROM w_quote_f_inserts;
end transaction;
The really strange part is that this takes in about 29GB of data but spits out about 49GB. I have no idea what's going on behind the scenes. The row counts match.
Weird. Naturally, because I am lazy, I do the exact same thing for UPDATES.
Anyway, I thought this might help someone.