Update Method option on dataflow
I was editing a Redshift dataflow this morning and noticed a new option on the output dataset named Update Method. It has a Replace and an Append option. I am not sure when this became available or how it should be used. As I understand it, all current output datasets replace themselves on every run, and to append you normally need to create a recursive dataflow that feeds the base data back into itself. Could someone give me an example of a use case for the Append option?
@cwolman, yes that is a great use case for it.
Key things to note:
Former Domo employee you can find me in the Dojo Community here @n8isjack0
- There must be no overlap between Dataset A and Dataset B. You cannot update records that were already loaded into Dataset A using this new method.
- Correcting errors is trickier. If a data load has to be reloaded, or was loaded twice, it is difficult to correct using the new method.
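To make the two caveats concrete, here is a rough sketch in plain Python. It is not Domo's implementation, just a model of "append" as adding rows to the end of a list; the `id` and `amount` columns and the helper names are made up for illustration.

```python
from collections import Counter

def append_load(output, batch):
    """Model of an append-style update: rows are added, never matched or updated."""
    output.extend(batch)
    return output

def duplicated_ids(rows):
    """Return the ids that appear more than once in the output."""
    counts = Counter(row["id"] for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)

# Hypothetical existing output (Dataset A) and a new batch (Dataset B).
output = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 5.0}]
batch = [{"id": 2, "amount": 6.0}, {"id": 3, "amount": 7.5}]

# Caveat 1: id 2 overlaps, so the old row is kept and a second copy is added.
append_load(output, batch)
overlap = duplicated_ids(output)

# Caveat 2: running the same load again duplicates every row in the batch.
append_load(output, batch)
after_double_load = duplicated_ids(output)
```

In this model nothing ever removes or rewrites a row, which is why an overlapping record or an accidental second load has to be cleaned up some other way.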
Hi @cwolman, this is a pretty exciting change but it is for specific situations. It likely will not replace your recursive dataflow.
It will allow you to take data that can simply be appended, but modify it first. Say that you are loading sales transactions. Reprocessing hundreds of millions of rows on every run is slow, so you just append the new data. This is great, but if you need to do data prep, cleanup, filtering, etc., it has to be done in every card using the data. This new method allows you to transform the data before appending it to the dataset.
Would this new functionality work for this scenario?
Basic recursive dataflow
Dataset A - contains 25M rows (base data)
Dataset B - contains 1M rows (new data)
Transform Dataset B and union it to Dataset A for the final output. Dataset A now contains 26M rows. Rinse and repeat daily.
Could I edit this existing dataflow, remove Dataset A as an input, and simply transform Dataset B and have it append to the output dataset using this new feature?
This would eliminate the time required to load Dataset A first, which would decrease processing time.
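For anyone comparing the two setups, here is a small sketch of the scenario in plain Python. It only models the row logic and is not how Domo or Redshift runs it; the `transform` step, the column names, and the sample rows are all hypothetical.

```python
def transform(rows):
    """Hypothetical prep step: drop zero or negative amounts, tidy the region."""
    return [
        {**row, "region": row["region"].strip().upper()}
        for row in rows
        if row["amount"] > 0
    ]

def run_recursive(dataset_a, dataset_b):
    """Replace-style run: read all of A, union transformed B, rewrite the output."""
    return dataset_a + transform(dataset_b)

def run_append(output, dataset_b):
    """Append-style run: only B is read and transformed, then added to the output."""
    output.extend(transform(dataset_b))
    return output

# Tiny stand-ins for the 25M-row base data and the 1M-row daily load.
dataset_a = [
    {"id": 1, "region": "WEST", "amount": 10.0},
    {"id": 2, "region": "EAST", "amount": 5.0},
]
dataset_b = [
    {"id": 3, "region": " west ", "amount": 7.5},
    {"id": 4, "region": "east", "amount": 0.0},
]

recursive_result = run_recursive(dataset_a, dataset_b)
append_result = run_append(list(dataset_a), dataset_b)
```

Both functions end with the same rows here. The difference is that `run_append` never has to read Dataset A, which is where the time saving in the question would come from, provided the new rows never overlap with what is already loaded.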