Hello,
In our data team, we are trying to build a recursive SQL dataflow that saves historical data and appends new rows as the original dataset is updated daily, following the instructions in this post: https://knowledge.domo.com/Prepare/DataFlow_Tips_and_Tricks/Creating_a_Recursive%2F%2FSnapshot_SQL_DataFlow. However, with the way it is set up right now, we end up with only two dates' worth of data: the day we first set it up and the most recent run. Each run replaces the previous data with the latest data instead of appending to it, so nothing in between is preserved.
These are the steps we followed to build this dataflow:
1. Run the following query on our original table to create the dataset we need (all within Domo). Let's call this dataset sampledataset:
SELECT club_id AS 'Club', COUNT(member_id) AS 'Client id', NOW() as 'Date'
FROM member_table
WHERE inactive=0 AND is_employee=0
GROUP BY club_id
Update setting: run the dataflow when member_table updates
2. Create an SQL dataflow with sampledataset as input and the following query as its output, then run it. Let's call this output historicaldataset:
SELECT *
FROM sampledataset
No update settings selected
3. Once step 2 is completed, add historicaldataset as an input in the same dataflow.
4. Create the following transform:
SELECT *
FROM historicaldataset
WHERE `Date` NOT IN (SELECT `Date` FROM sampledataset)
Generate output table called historical_data
5. Add this additional transform:
SELECT * FROM sampledataset
UNION ALL
SELECT * FROM historical_data
Generate output table called append_new_data_to_historical
6. Generate output using:
SELECT *
FROM append_new_data_to_historical
Dataflow update settings: run only when sampledataset updates.
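To make the behavior we expect concrete, here is a minimal Python/sqlite3 sketch that simulates steps 4 and 5 across several daily runs (column and table names match the ones above, but the sample data and dates are made up for illustration). In this simulation the history grows by one date per run, which is what we want the dataflow to do:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# historicaldataset starts out as a copy of the first run's sampledataset (step 2);
# here we just start it empty for simplicity.
cur.execute("CREATE TABLE historicaldataset (Club TEXT, ClientCount INTEGER, Date TEXT)")

def run_dataflow(sample_rows):
    """Simulate one run of the dataflow: step 4 (filter) and step 5 (union)."""
    cur.execute("CREATE TABLE sampledataset (Club TEXT, ClientCount INTEGER, Date TEXT)")
    cur.executemany("INSERT INTO sampledataset VALUES (?, ?, ?)", sample_rows)
    # Step 4: keep only historical rows whose Date is not in today's sample
    cur.execute("""
        CREATE TABLE historical_data AS
        SELECT * FROM historicaldataset
        WHERE Date NOT IN (SELECT Date FROM sampledataset)
    """)
    # Step 5: append today's sample to the filtered history
    cur.execute("""
        CREATE TABLE append_new_data_to_historical AS
        SELECT * FROM sampledataset
        UNION ALL
        SELECT * FROM historical_data
    """)
    result = cur.execute(
        "SELECT * FROM append_new_data_to_historical ORDER BY Date"
    ).fetchall()
    # The output feeds back in as the next run's historicaldataset (the recursive part)
    cur.execute("DROP TABLE sampledataset")
    cur.execute("DROP TABLE historical_data")
    cur.execute("DROP TABLE historicaldataset")
    cur.execute("ALTER TABLE append_new_data_to_historical RENAME TO historicaldataset")
    return result

# Three simulated daily runs; each run should add one new date to the history
run_dataflow([("A", 10, "2023-01-01")])
run_dataflow([("A", 12, "2023-01-02")])
rows = run_dataflow([("A", 11, "2023-01-03")])
print(rows)  # expect rows for all three dates, not just the first and last
```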
Clearly we have not set something up correctly, because the data is being replaced rather than appended after each run. Any help will be greatly appreciated.
Thanks!