Snapshotting Data
Hi all,
Any help with the following would be much appreciated.
My original dataset comes from a proprietary software package that is updated daily.
I have a dataset where I would like to calculate the total number of units we have month-on-month. Currently, I can calculate this as a Summary Number on a card, but I would like to be able to build a card from this data and utilise it.
The Summary Number is calculated by:
- a count of UniqueUnitID
- filtered to remove rows where columnX = "REMOVED" and columnY = "NotApplicable"
columnX will hold a value until the unit is physically removed, at which point it is updated to "REMOVED" status. Currently, there is no date/time associated with the change from the previous value to "REMOVED", so it is difficult to track the Summary Number for each month.
Rows are also added to the original dataset, but there is a DateAdded column that shows when this has happened. Each row has a unique value in the UniqueUnitID column.
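For reference, the equivalent logic written as a rough SQL sketch (the `units` table name is an assumption; UniqueUnitID, columnX and columnY are as described above):

```sql
-- Rough sketch of the Summary Number logic, assuming the input dataset
-- is exposed to a SQL DataFlow as a table called `units`, and that the
-- card filters exclude both "REMOVED" and "NotApplicable" rows.
SELECT COUNT(UniqueUnitID) AS total_units
FROM units
WHERE columnX <> 'REMOVED'
  AND columnY <> 'NotApplicable';
```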
I have been reading about recursive dataflows and believe this may be the way to go, but would appreciate any help in setting this up.
Thanks in advance and hope this is clear enough.
Jack
Answers
Hi @JackMcG, have you read through these knowledge articles?
- Magic ETL 1.0: Creating a Recursive/Snapshot Magic ETL DataFlow
- Magic ETL 2.0: Creating a Recursive/Snapshot DataFlow in Magic ETL v2 (Beta)
You can also use the Dataset Copy connector [https://domohelp.domo.com/hc/en-us/articles/360043436533-DataSet-Copy-DataSet-Connector] to create a snapshot (with the update setting being Append).
Looks like I need to wake up earlier to get my answers in before @amehdad :)
I agree with @amehdad but wanted to add some clarification. The Magic ETL or MySQL recursive dataflows will let you filter out possible duplicate records; however, the larger your dataset becomes, the slower and less performant the dataflow gets, because there is more data that needs to be imported and read in.
Using Dataset Copy (with append) will be more performant, as it only appends the new records to your dataset and there is less data to import and process. However, its processing is detached from the underlying dataset you're appending from. With a recursive dataflow you can trigger it to run whenever the underlying dataset is updated, whereas with Dataset Copy you have to run it at a set time, with the possibility that it runs before the underlying data has been updated.
In your use case you'll need to utilize a recursive DataFlow and not Dataset Copy, as you need to process the data to update the status and add a datetime for when that status changed (i.e. when a unit was removed from the original dataset).
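To make that concrete, here is a minimal sketch of what the recursive pattern could look like in a MySQL DataFlow. The `source_units` and `unit_history` input names and the `RemovedDate` column are assumptions for illustration (Domo doesn't create them for you); `unit_history` is this dataflow's own output, fed back in as an input on the next run.

```sql
-- Minimal sketch of a recursive DataFlow that stamps a date the first time
-- a unit shows up as "REMOVED". Inputs (assumed names):
--   source_units : the live dataset, refreshed nightly
--   unit_history : this dataflow's previous output, added back as an input
SELECT
    s.UniqueUnitID,
    s.columnX,
    s.columnY,
    s.DateAdded,
    CASE
        WHEN s.columnX = 'REMOVED'
             AND (h.UniqueUnitID IS NULL OR h.columnX <> 'REMOVED')
            THEN CURRENT_DATE()   -- first run on which this unit shows as REMOVED
        ELSE h.RemovedDate        -- keep any previously recorded date (NULL if never removed)
    END AS RemovedDate
FROM source_units s
LEFT JOIN unit_history h
    ON s.UniqueUnitID = h.UniqueUnitID;
```

The output dataset then replaces `unit_history` each run, which is what makes the dataflow recursive; month-on-month counts can be derived from DateAdded and RemovedDate afterwards.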
Hi Jack, how did you go with this?
Hi both,
Thank you for your replies on this. I have still been working through the problem.
Currently, I have created a dataset connector that appends on update. I understand the issue of the appended dataset updating before the main dataset; however, the main dataset updates every night and I do not require a high level of granularity on the appended dataset, so I believe the performance and reliability of the dataset connector make this the best option for me.
Currently I have the appended dataset updating every night, but the aim of the card is to track week-on-week, so I may be able to push this out.
Next steps for me will be to remove duplicates where all fields are the same with the exception of _BATCH_ID_ and _BATCH_LAST_RUN_. I am looking to keep the earliest version of each row, as this will allow me to track what is moving into the "REMOVED" status week-on-week / month-on-month.
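As a rough sketch, that de-duplication could look something like this in a SQL DataFlow (the `unit_snapshots` table name and exact column list are assumptions; adjust to the real schema):

```sql
-- Keep the earliest copy of each distinct row, ignoring Domo's
-- _BATCH_ID_ / _BATCH_LAST_RUN_ system columns. Assumes the appended
-- snapshot dataset is exposed as `unit_snapshots`.
SELECT
    t.UniqueUnitID,
    t.columnX,
    t.columnY,
    t.DateAdded,
    MIN(t._BATCH_LAST_RUN_) AS FirstSeen   -- earliest run this exact row appeared
FROM unit_snapshots t
GROUP BY
    t.UniqueUnitID,
    t.columnX,
    t.columnY,
    t.DateAdded;
```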
I will update later with progress on this.
Thanks,
Jack
The recursive dataflow is super easy. The trick is to get the dataset set up and then append to it daily. I followed the help article Domo had for my Salesforce Leads object. It took a little finagling because the article doesn't explain where the original dataset comes from. Hint: you have to make it.
It's saving snapshots of my data every night at midnight.
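For anyone following along, a minimal sketch of that nightly snapshot step in a MySQL DataFlow, assuming two inputs named `leads_today` (the live Salesforce Leads dataset) and `leads_history` (the dataflow's own output, fed back in); the `SnapshotDate` column name is also an assumption:

```sql
-- Stamp tonight's rows with the run date, then keep everything already
-- captured on previous runs. leads_history is assumed to have the same
-- columns as leads_today plus SnapshotDate as its last column.
SELECT
    t.*,
    CURRENT_DATE() AS SnapshotDate
FROM leads_today t

UNION ALL

SELECT
    h.*
FROM leads_history h;
```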