Use Case for Recursive CTE -- Or is there an alternative?

Options
BSovers
BSovers Member
edited March 11 in Magic ETL Ideas

I am pulling DomoStats data sets for:

DataFlow Input DataSets

DataFlow Output DataSets

First Step - Combining

input_output defined as:

Select 	dfin."Dataflow ID" as dfid
, dfin."Datasource Input ID" as dsin
, dfout."Datasource Output ID" as dsout
from "dataflow_input_datasets" as dfin
join "dataflow_output_datasets" as dfout on dfin."Dataflow ID" = dfout."Dataflow ID"

Then trying to put them together in a RedShift Recursive CTE

WITH RECURSIVE LevelN (lvl, dsin, dsout)  as
( SELECT 1 as lvl
, dsin
, dsout
FROM "input_output"
WHERE dsin != dsout UNION ALL

SELECT A.lvl + 1 as lvl
, A.dsin as dsin
, B.dsout as dsout
FROM LevelN as A
JOIN "input_output" as B on A.dsout = B.dsin
WHERE A.lvl < 10 and A.dsin != B.dsout ` )
SELECT lvl,dsin,dsout from LevelN group by 1,2,3

I had to add the A.lvl < 10 or RedShift would throw an error on the recursion limit.

This was working before our dataflows got too complicated. Now it is now failing with the error:

The database reported an error: Hit recursive CTE max rows limit, please add correct CTE termination predicates or change the max_recursion_rows parameter. Details: ----------------------------------------------- error: Hit recursive CTE max rows limit, please add correct CTE te
The whole goal of this is a dataset with ALL input datasets that are required for a given output dataset. I was planning to use this to verify that a dataset was not connected to any cards that had been accessed in the past year (also using the DomoStat data set for activity)Has any one else found a way to do this?

Possible solutions (That I have tried)—

  • Do as suggested in the error — change the max_recursion_rows parameter (I created a support ticket for this and got nowhere but this "idea"/plea, Engineering said that they cannot change the RedShift parameter.)
  • Try to get Domo to update MySQL to a version that supports recursive CTE's (yet another support ticket that is un-answered in almost 2 years)
  • Allow Magic ETL to do recursive processing that is not just one-run at a time (This would need some mechanism to continue to trigger a dataflow to refresh, until a goal is achieved)
  • Recursive CTE in a View (now that we can get to the SQL definition of the view) — Tried this, but whatever variant of SQL being used for Views does not support CTEs, let alone recursive CTEs
Untried Possibilities
  • Use the archaic version of MySQL and write a stored procedure to handle the recursion (I was not aware that we could write MySQL stored procedures, so I never tried this)
  • Switch to a different BI / ETL tool that allows modern DB activities — I have thought of this one!

1
1 votes

Comments

  • jace_at_domo
    Options

    Hello-

    I have actually written this before in python. Do you have access to jupyter, workflows or python tiles in Magic?

    Thanks!

    —jace

  • BSovers
    BSovers Member
    edited March 29
    Options

    @jace_at_domo — I only have access to Magic ETL 2.0, MySQL, and Redshift transforms. I am very disappointed that NONE of these seem to be able to handle something that would be simple in a modern database!