Appending an update set to itself
Hi,
I have a MySQL dataflow that is built from API reports from a vendors application. The idea is that we did a one time historical pull of the API as dataset 1. Then, the same API that only pulls 1 day of data and appends to the historical dataset as dataset 2.
But, every once and a while it seems as if data is dropping out. I've been struggling to determine if this is an error on the API side of things, meaning the data is missing from the report, or if its an issue with how the MySQL dataflow is built, or maybe a connector issue.
So I am just wondering, does anyone see any flaws/potential ways data could be removed/left out in this following configuration? The goal here is to simply have all data continually append every day.
First, I took the "all time historical" dataset into the MySQL dataflow and created the output of "BB_Billing_Activities_Raw". Then, I modified this MySQL dataflow and replaced the "all time historical" dataset with "BB_Billing_Activities_Raw", so that it is feeding into itself.
Then, I added the "daily update" dataset that just looks at one day. and then this is the transformation
(The part I don't understand, which the vendor told me to add, is the "where u.billing-acitivty-id' is null" because this will never be null, and so the top part will return nothing. But I'm assuming this is just to make sure schema matches)
select m.*
from `bb_billing_activities_raw`m
left outer join `bb_billing_activities_update` u
on m.`billing-activity-id` = u.`billing-activity-id`
where u.`billing-activity-id` is null
union ALL
select * from bb_billing_activities_update
Answers
-
A recursive dataflow is the most common solution to this. You can read about it in this KB article:
https://domohelp.domo.com/hc/en-us/articles/360043428133-Creating-a-Recursive-Snapshot-SQL-DataFlow
There are also KB articles for doing it in Magic ETL. I would also recommend looking at this YouTube video by @jaeW_at_Onyx https://www.youtube.com/watch?v=JNQFZCj8JcQ&t=6s
**Check out my Domo Tips & Tricks Videos
**Make sure to any users posts that helped you.
**Please mark as accepted the ones who solved your issue.1 -
Hi @Jbrorby
It sounds like you’re doing a recursive dataflow. The null check is for making sure only new records are being pulled in to avoid duplicates. I’d look at your timing on when the underlying daily dataset and when you’re pulling that data to see if it’s a possibility that when you’re pulling the API data that it’s updated in your vendors system at that time.
I’m not a huge fan of recursive data flows as the larger your dataset gets the slower the performance becomes. In your case there may not be a better solution unless you don’t care about the possibility of duplicate records. If you’re using a Domo connector you could set it to append instead of replace.
Here’s a link outlining the MySQL recursive dataflow and also one for a Magic ETL version
https://domohelp.domo.com/hc/en-us/articles/360043428133-Creating-a-Recursive-Snapshot-SQL-DataFlow
**Was this post helpful? Click Agree or Like below**
**Did this solve your problem? Accept it as a solution!**2 -
Yes, that is one thing I have noticed is the time it takes to run continues to increase. It used to be less than an hour, and now it takes over 13 hours, and its not even a year old.
One thing I was thinking was instead of the way I currently have it, which sounds like an attempt at a recursive dataflow, is if, like Grant said, I just set the daily update connector set to append. Then I could just add tile logic to remove duplicates in the ETL.
I wonder if this might be a better/quicker option if I'm not worried about duplicates. I also think this might help me audit the API if I ever need to look back to see if it missed something.
1 -
@Jbrorby if you can avoid the recursive dataflow and make use of the append, that should prove to be a better long term solution for you as far as performance and maintenance. I would definitely recommend working through that option first and seeing if that works for you before going the recursive route.
**Check out my Domo Tips & Tricks Videos
**Make sure to any users posts that helped you.
**Please mark as accepted the ones who solved your issue.1 -
If you can’t use the connectors append update method I have utilized the DataSet Copy connector to take daily snapshots of a dataset and set that method to append.
https://domohelp.domo.com/hc/en-us/articles/360043436533-DataSet-Copy-DataSet-Connector
**Was this post helpful? Click Agree or Like below**
**Did this solve your problem? Accept it as a solution!**1
Categories
- All Categories
- 1.8K Product Ideas
- 1.8K Ideas Exchange
- 1.5K Connect
- 1.2K Connectors
- 299 Workbench
- 6 Cloud Amplifier
- 8 Federated
- 2.9K Transform
- 100 SQL DataFlows
- 616 Datasets
- 2.2K Magic ETL
- 3.8K Visualize
- 2.5K Charting
- 729 Beast Mode
- 54 App Studio
- 40 Variables
- 680 Automate
- 175 Apps
- 451 APIs & Domo Developer
- 46 Workflows
- 8 DomoAI
- 34 Predict
- 14 Jupyter Workspaces
- 20 R & Python Tiles
- 394 Distribute
- 113 Domo Everywhere
- 275 Scheduled Reports
- 6 Software Integrations
- 121 Manage
- 118 Governance & Security
- Domo Community Gallery
- 33 Product Releases
- 10 Domo University
- 5.4K Community Forums
- 40 Getting Started
- 30 Community Member Introductions
- 108 Community Announcements
- 4.8K Archive