"Append to DataSet" method is creating duplicates

In a nutshell:

1) create & build the initial dataset in workbench


SELECT * FROM myTable
-- WHERE StartDateTime > '!{lastvalue:StartDateTime}!'
ORDER BY StartDateTime



2) update job method from Replace to Append:

a) uncomment my WHERE clause in my query
b) update/add "replacement variable" subtab

     Column = StartDateTime
     Value = !{lastvalue:StartDateTime}!
c) switch dataset to append

d) save & run


Results:  the dataset job appends new records as expected, the only issue is the occasional duplicate record (and sometimes triplicate records).  I know I can remove the duplicates with a dataflow, but shouldn't have to...

Any ideas why the duplicates are happening?

We also have a couple other append dataset jobs, they've all running fine for months (except for random duplicate/triplicates).  I have experimented with other queries, using primary keys instead of DateTime, same issue - happening with all append dataset jobs, not just this specific one.

Thanks

edit: added code tags in attempt to remove emoticon/smiley... : S

Comments

  • I can't tell you why, but I can tell you that we experienced some of the same issues with replacement variables.  

    In some instances we decided to take a different approach by creating multiple replacement datasets of different timeframes and stacking them into one.  

    In other cases we've kept the append, removed the replacement variable, and used a relative date, like trx_date=sysdate-1, or something similar.  We haven't seen duplicates with this method.

     

     

    Aaron
    MajorDomo @ Merit Medical

    **Say "Thanks" by clicking the heart in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"
  • Hi,

     

    So the Problem which your curently facing is because timestamp value which your using for incremental records. Even i have faced the same thing and have dropped a mail to DOMO support but as of now you cant control the syntax of the replcament variable , so you have to use the data flow for creating this.

     

    If our database has date insted of timestamp use that am sure you wont se any duplicates , also make sure to cast it in the right way.

     

     

This discussion has been closed.