Recursive ETL
Hi guys,
I am pulling data from our db that looks like this:
Date|Name|Value
Date and Name together form the unique key.
I sort of understand the recursive ETL: it appends new data and replaces old data with new data. But I am missing the point slightly. Should the new data be a complete pull from the db or a subset?
I am looking to just get the changes from our db since the last pull and merge those in, rather than keep pulling years' worth of data every time.
Having just watched a Domopalooza session, am I right in thinking my query to our db would just pull changes from, say, the last 10 days to make the set smaller?
Any help would be appreciated.
Best Answer
-
Correct, you'd only pull in the changes that need to be applied, which keeps your data processing quicker (fewer records). You'd need an initial pull of all your data to establish your baseline, but after that you can just pull in the records that changed since the last time you ran it.
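To make the append-and-replace idea concrete, here is a minimal sketch in plain Python rather than in Domo itself. It assumes rows are dictionaries keyed on the Date and Name columns from the question; the function name `merge_changes` and the sample values are made up for illustration.

```python
from datetime import date


def merge_changes(history, changes):
    """Append-and-replace merge keyed on (date, name).

    Rows in `changes` overwrite rows in `history` that share the same key;
    rows with a new key are appended.
    """
    merged = {(row["date"], row["name"]): row for row in history}
    for row in changes:
        merged[(row["date"], row["name"])] = row
    # Sort only so the result is deterministic and easy to inspect.
    return sorted(merged.values(), key=lambda r: (r["date"], r["name"]))


# Baseline: the one-off full pull (hypothetical sample data).
baseline = [
    {"date": date(2024, 1, 1), "name": "A", "value": 10},
    {"date": date(2024, 1, 1), "name": "B", "value": 20},
]

# Later run: only the rows that changed since the last pull.
changes = [
    {"date": date(2024, 1, 1), "name": "B", "value": 25},  # replaces an existing row
    {"date": date(2024, 1, 2), "name": "A", "value": 30},  # brand-new row
]

result = merge_changes(baseline, changes)
```

After the merge, `result` holds three rows: the untouched row for A, the updated row for B, and the new row for the second date.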
Answers
-
@GrantSmith great thanks.
Yes, I believe I have this all working now. Pulling changes from the source every 30 mins and matching records on the two key columns seems to be working.
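For anyone reading later, the source-side query can be sketched roughly like this. It uses an in-memory SQLite database as a stand-in for the real db; the table name `source_table`, the helper `pull_recent`, and the sample rows are all assumptions for illustration. Note that filtering on Date only picks up rows whose Date falls inside the window, so if older rows can be edited you would likely need to filter on a last-modified column instead.

```python
import sqlite3
from datetime import date, timedelta


def pull_recent(conn, today, days=10):
    """Return only the rows whose Date falls within the last `days` days."""
    cutoff = (today - timedelta(days=days)).isoformat()
    cur = conn.execute(
        "SELECT Date, Name, Value FROM source_table "
        "WHERE Date >= ? ORDER BY Date, Name",
        (cutoff,),
    )
    return cur.fetchall()


# Stand-in for the real source database (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source_table ("
    "Date TEXT, Name TEXT, Value REAL, PRIMARY KEY (Date, Name))"
)
conn.executemany(
    "INSERT INTO source_table VALUES (?, ?, ?)",
    [
        ("2023-01-15", "A", 1.0),  # old row, outside the window
        ("2024-03-01", "A", 2.0),
        ("2024-03-08", "B", 3.0),
    ],
)

recent = pull_recent(conn, today=date(2024, 3, 10))
```

Here `recent` contains just the two March rows, which is the small set that then gets merged into the existing history on the two keys.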
-
My dataset has about 5.6 million records, and the recursive ETL to bring in new records takes 35 seconds.
Does that sound reasonable?
-
Yeah, that sounds reasonable. The one caveat with recursive dataflows is that they don't scale very well: the larger the dataset grows, the longer the ETL takes to run, because each run still has to read and rewrite the full history (more data to transfer means more time).