Snapshotting Data

Question

Hi all,

Any help with the following would be much appreciated.

My original dataset comes from a proprietary software package that is updated daily.

I have a dataset for which I would like to calculate the total number of units we have month-on-month. Currently, I can calculate this as a Summary Number on a card, but I would like to be able to make a card from this data and utilise it.

The Summary Number is calculated by:

* count of UniqueUnitID
* filtered to exclude rows where columnX = "REMOVED" and columnY = "NotApplicable"
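
As a rough illustration of the calculation described above, here is a minimal Python sketch. The column names come from the post; the sample rows are made up, and the filter is read literally, so a row is dropped only when both conditions hold. If the intent is to drop rows matching either condition, the `and` would become `or`.

```python
# Hypothetical rows: column names follow the post, the values are invented.
rows = [
    {"UniqueUnitID": "U1", "columnX": "ACTIVE", "columnY": "Applicable"},
    {"UniqueUnitID": "U2", "columnX": "REMOVED", "columnY": "NotApplicable"},
    {"UniqueUnitID": "U3", "columnX": "REMOVED", "columnY": "Applicable"},
    {"UniqueUnitID": "U4", "columnX": "ACTIVE", "columnY": "NotApplicable"},
]


def summary_number(rows):
    """Count distinct UniqueUnitID values after dropping filtered rows."""
    kept = {
        r["UniqueUnitID"]
        for r in rows
        # Assumption: a row is dropped only when BOTH conditions hold.
        if not (r["columnX"] == "REMOVED" and r["columnY"] == "NotApplicable")
    }
    return len(kept)


print(summary_number(rows))
```

Only U2 matches both conditions here, so the other three units are counted.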

columnX holds a value until the unit is physically removed, at which point it is updated to the "REMOVED" status. Currently, no date/time is recorded when the value changes to "REMOVED", so it is difficult to track the Summary Number for each month.

Rows are also added on to the original dataset, but there is a DateAdded column that shows when this has happened. Each row has a unique value in the UniqueUnitID column.

I have been reading about recursive dataflows and believe this may be the way to go, but would appreciate any help in setting this up.

Thanks in advance and hope this is clear enough.

Jack

Canioagain · Answer

The recursive dataflow is super easy. The trick is to get the dataset set up and then append to it daily. I followed Domo's help article for my Salesforce Leads object. It took a little finagling because the article doesn't explain where the original dataset comes from. Hint: you have to make it.

It's saving snapshots of my data every night at midnight.
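
For anyone trying to picture the pattern, here is a minimal Python sketch of the idea behind a recursive dataflow: each run takes the stored history, adds the current rows stamped with the run date, and writes the result back as the new history. This is not Domo's dataflow syntax; `append_snapshot`, the `SnapshotDate` column, and the sample dates are all hypothetical names chosen for illustration.

```python
from datetime import date


def append_snapshot(history, current_rows, snapshot_date):
    """Return history plus the current rows, each stamped with snapshot_date.

    Rows already stored for the same date are replaced, so running twice
    on one day does not duplicate that day's snapshot.
    """
    stamp = snapshot_date.isoformat()
    kept = [r for r in history if r["SnapshotDate"] != stamp]
    stamped = [{**r, "SnapshotDate": stamp} for r in current_rows]
    return kept + stamped


# The history has to exist before the first run; here it starts empty.
history = []
day1 = [{"UniqueUnitID": "U1", "columnX": "ACTIVE"}]
day2 = [
    {"UniqueUnitID": "U1", "columnX": "REMOVED"},
    {"UniqueUnitID": "U2", "columnX": "ACTIVE"},
]

history = append_snapshot(history, day1, date(2024, 1, 1))
history = append_snapshot(history, day2, date(2024, 1, 2))
history = append_snapshot(history, day2, date(2024, 1, 2))  # repeated run
print(len(history))
```

The empty starting `history` corresponds to the hint above: the output dataset has to be created once by hand before the flow can read it back in as an input.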

JackMcG · Answer

Hi both,

Thank you for your replies on this. I have still been working through the problem.

I have created a dataset connector that appends on update. I understand the issue with the appended dataset updating before the main dataset. However, the main dataset updates every night and I do not require a high level of granularity on the appended dataset, so I believe the performance and reliability of the dataset connector are likely to make this the best option for me.

The appended dataset currently updates every night, but the aim of the card is to track week-on-week, so I may be able to push the update interval out.

Next steps for me will be to remove duplicates where all fields are the same with the exception of the _BATCH_ID_ and BATCH_LAST_RUN. I am looking to keep the earliest version of the row as this will allow me to track what is moving into the "REMOVED" status week-on-week / month-on-month.
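
A minimal Python sketch of that de-duplication step, under stated assumptions: the batch column names are written as they appear in this post, a lower `_BATCH_ID_` is assumed to mean an earlier append, and the sample rows are invented. `keep_earliest` is a hypothetical helper, not a Domo function.

```python
# Columns ignored when deciding whether two rows are duplicates.
BATCH_COLUMNS = {"_BATCH_ID_", "BATCH_LAST_RUN"}


def keep_earliest(rows):
    """Drop rows that repeat an earlier row in every non-batch column."""
    seen = set()
    result = []
    # Assumption: a lower _BATCH_ID_ means an earlier append.
    for row in sorted(rows, key=lambda r: r["_BATCH_ID_"]):
        key = tuple(
            sorted((k, v) for k, v in row.items() if k not in BATCH_COLUMNS)
        )
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


# Hypothetical appended data: one unit captured on four nights.
appended = [
    {"UniqueUnitID": "U1", "columnX": "ACTIVE", "_BATCH_ID_": 1, "BATCH_LAST_RUN": "2024-01-01"},
    {"UniqueUnitID": "U1", "columnX": "ACTIVE", "_BATCH_ID_": 2, "BATCH_LAST_RUN": "2024-01-02"},
    {"UniqueUnitID": "U1", "columnX": "REMOVED", "_BATCH_ID_": 3, "BATCH_LAST_RUN": "2024-01-03"},
    {"UniqueUnitID": "U1", "columnX": "REMOVED", "_BATCH_ID_": 4, "BATCH_LAST_RUN": "2024-01-04"},
]

deduped = keep_earliest(appended)
for row in deduped:
    print(row["UniqueUnitID"], row["columnX"], row["BATCH_LAST_RUN"])
```

The earliest surviving "REMOVED" row then carries the batch date on which the change was first captured, which can stand in for the missing removal date. One caveat: a unit that returns to an earlier state would have its later rows dropped as duplicates of the first occurrence.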

I will update later with progress on this.

Thanks,

Jack

amehdad · Answer

Hi Jack, how did you go with this?