Recursive ETL
Hi guys,
I am pulling data from our db that looks like:
Date|Name|Value
DATE and NAME together are the unique key.
I sort of understand recursive ETL, and that it will append new data and replace old data with new data, but I am missing the point slightly. Should the new data be a complete pull from the db or a subset?
I am looking to just get changes from our db since the last pull and merge those in, rather than keep pulling years' worth of data all of the time.
Having just watched a Domopalooza session, am I right in thinking my query to our db would just pull changes from, say, the last 10 days to make the set smaller?
Any help would be appreciated.
Best Answer
-
Correct, you'd only pull in the changes that need to be applied, which keeps your data processing quicker (fewer records). You'd need an initial pull of all your data to establish your baseline, but after that you can just pull in the records that changed since the last time you ran it.
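To make the "baseline once, then only changes" idea concrete, here is a minimal sketch of a windowed pull. It uses an in-memory SQLite table as a stand-in for the source db. The table name `source`, the helper `pull_recent`, and the 10-day default are assumptions for illustration, not anything Domo-specific. It also assumes dates are stored as ISO `YYYY-MM-DD` text and that changes only ever land on recent dates; edits to older rows would fall outside the window and be missed.

```python
import sqlite3
from datetime import date, timedelta


def pull_recent(conn, today, lookback_days=10):
    """Return only rows whose Date falls inside the lookback window (hypothetical helper)."""
    cutoff = (today - timedelta(days=lookback_days)).isoformat()
    cur = conn.execute(
        "SELECT Date, Name, Value FROM source WHERE Date >= ? ORDER BY Date, Name",
        (cutoff,),
    )
    return cur.fetchall()


# Stand-in for the real source database (assumption: ISO date strings).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source (Date TEXT, Name TEXT, Value REAL, PRIMARY KEY (Date, Name))"
)
conn.executemany(
    "INSERT INTO source VALUES (?, ?, ?)",
    [
        ("2024-01-01", "a", 1.0),  # old row, outside the window
        ("2024-03-05", "a", 2.0),  # inside the window
        ("2024-03-10", "b", 3.0),  # inside the window
    ],
)

# Only the rows dated within 10 days of the run date are pulled.
recent = pull_recent(conn, today=date(2024, 3, 12))
```

The very first run would skip the filter (or use a very large lookback) to load the full baseline; every later run pulls only the small window.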
Answers
-
@GrantSmith great thanks.
Yes I believe I have this all working now. Pulling changes from the source every 30 mins and checking two keys on the records seems to be working.
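For anyone following along, the "checking two keys" step can be sketched roughly as below. This is only an illustration of the append-and-replace logic in plain Python, not how Domo implements it internally; the function name `recursive_merge` and the sample rows are made up.

```python
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, str, float]  # (Date, Name, Value)


def recursive_merge(history: Iterable[Row], changes: Iterable[Row]) -> List[Row]:
    """Append new rows and replace existing ones, keyed on (Date, Name)."""
    merged: Dict[Tuple[str, str], float] = {}
    for row_date, name, value in history:
        merged[(row_date, name)] = value
    # Changes are applied last, so they overwrite any historical row with the same key.
    for row_date, name, value in changes:
        merged[(row_date, name)] = value
    return sorted((d, n, v) for (d, n), v in merged.items())


# Hypothetical sample data: one row is updated, one is brand new.
history = [("2024-01-01", "a", 1.0), ("2024-01-02", "a", 2.0)]
changes = [("2024-01-02", "a", 2.5), ("2024-01-03", "a", 3.0)]
result = recursive_merge(history, changes)
```

The output of each run becomes the `history` input of the next run, which is what makes the dataflow recursive.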
-
My dataset has about 5.6 million records and the recursive ETL to bring in new records takes 35 seconds.
Does that sound reasonable?
-
Yeah, that sounds about reasonable. The one caveat with recursive dataflows is that they don't scale well: the larger the dataset grows, the longer the ETL takes to run (more data to transfer means more time).