Recursive ETL
Hi guys,
I am pulling data from our db that looks like this:
Date|Name|Value
DATE and NAME together form the unique key.
I sort of understand how a recursive ETL works: it appends new data and replaces old rows with updated ones. But I am missing the point slightly. Should each run do a complete pull from the db, or only a subset?
I would like to pull only the changes made in our db since the last pull and merge those in, rather than pulling years' worth of data every time.
Having just watched a Domopalooza session, am I right in thinking my query to our db could pull only the changes from, say, the last 10 days to keep the set smaller?
Any help would be appreciated.
Best Answer
-
Correct, you'd pull in only the changes that need to be applied, which keeps your data processing quicker (fewer records). You'd need an initial pull of all your data to establish your baseline, but after that you can pull in just the records that changed since the last run.
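As a rough illustration of the merge step described above, here is a minimal Python sketch of an upsert keyed on (date, name). It is not how Domo implements recursive dataflows; the function name `merge_changes` and the sample rows are made up for the example.

```python
from datetime import date

def merge_changes(history, changes):
    """Upsert: a row in `changes` replaces the history row with the same
    (date, name) key, or is appended if that key is new."""
    merged = {(row["date"], row["name"]): row for row in history}
    for row in changes:
        merged[(row["date"], row["name"])] = row  # replace or append
    return sorted(merged.values(), key=lambda r: (r["date"], r["name"]))

# Baseline from the initial full pull (hypothetical sample data).
history = [
    {"date": date(2024, 1, 1), "name": "A", "value": 10},
    {"date": date(2024, 1, 2), "name": "A", "value": 20},
]
# Small incremental pull: one corrected row and one brand-new row.
changes = [
    {"date": date(2024, 1, 2), "name": "A", "value": 25},
    {"date": date(2024, 1, 3), "name": "A", "value": 30},
]
result = merge_changes(history, changes)
```

After the merge, `result` holds three rows: the untouched 1 January row, the 2 January row with its new value, and the appended 3 January row.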
Answers
-
@GrantSmith great thanks.
Yes I believe I have this all working now. Pulling changes from the source every 30 mins and checking two keys on the records seems to be working.
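For reference, a source-side pull with a lookback window (the "last 10 days" idea from the question) could look something like the sketch below. It uses an in-memory SQLite database as a stand-in for the real source; the table name `source_table`, the function `pull_recent_changes`, and the sample rows are all assumptions. Note that filtering on the date column only picks up rows whose date falls inside the window; catching edits to older rows would need something like a last-modified column, which this thread does not mention.

```python
import sqlite3
from datetime import date, timedelta

def pull_recent_changes(conn, today, lookback_days=10):
    """Return rows dated within the last `lookback_days` days."""
    cutoff = (today - timedelta(days=lookback_days)).isoformat()
    cur = conn.execute(
        "SELECT date, name, value FROM source_table "
        "WHERE date >= ? ORDER BY date, name",
        (cutoff,),  # ISO-8601 date strings compare correctly as text
    )
    return cur.fetchall()

# Stand-in source database with (date, name) as the unique key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source_table "
    "(date TEXT, name TEXT, value REAL, PRIMARY KEY (date, name))"
)
conn.executemany(
    "INSERT INTO source_table VALUES (?, ?, ?)",
    [
        ("2023-06-01", "A", 1.0),  # old row, outside the window
        ("2024-03-05", "A", 2.0),
        ("2024-03-09", "B", 3.0),
    ],
)
recent = pull_recent_changes(conn, today=date(2024, 3, 10))
```

Here `recent` contains only the two March rows, which is the small set that would then be merged into the existing history on the two key columns.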
-
My dataset has about 5.6 million records, and the recursive ETL that brings in new records takes 35 seconds.
Does that sound reasonable?
-
Yeah, that sounds reasonable. The one caveat with recursive dataflows is that they don't scale well: the larger the dataset grows, the longer the ETL takes to run (more data to transfer means more time).