Removing Dataset Outputs from Magic ETL

We have been cleaning up our Domo instance by removing unnecessary outputs from Magic ETL flows. I am assuming these datasets aren't truly deleted, just disconnected from their source so that they no longer update. Is there a way to find these "orphaned" outputs so that they can be properly deleted?

Best Answer

  • ArborRose (Coach)
    Answer ✓

    I think DomoStats has:

    - Dataset ID
    - Dataset Name
    - Dataflow Name (if the dataset is used as part of a dataflow).
    - Input/Output Type (whether the dataset is an input or output for a given dataflow).


Answers

  • Orphaned datasets would be datasets that, when investigated in DomoStats, show no "In Use in Cards" and no "In Use in Dataflows" references. So you should be able to use the DomoStats connector to pull the "DomoStats - Dataset Usage" report and filter for datasets with no references in cards or dataflows, especially those whose "last accessed" timestamp is old.

    CASE
    WHEN `In Use in Cards` = 0 AND `In Use in DataFlows` = 0 THEN 'Orphaned'
    ELSE 'In Use'
    END


  • When I look at "In Use in Dataflows", it seems to count only the dataflows that use the dataset as an input, rather than covering both inputs and outputs.

    What I am looking for are datasets that no longer have a dataflow associated with them at all (that is, no dataflow is connected to power the dataset). I suppose I misused the term "orphaned" in this context, as the number of cards connected is irrelevant.

    Put differently, is there a way to see the input dataflow for my datasets in a report?

  • Awesome, I was able to join the DataFlow Output Sources report from DomoStats with the Datasets report from DomoStats to identify what I need: datasets where import_type = dataflow and Dataflow ID is null.
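
For anyone who wants to see the shape of that join, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (`datasets`, `dataflow_outputs`, `dataset_id`, `import_type`, `dataflow_id`) are stand-ins I made up for illustration; the real DomoStats reports may name these columns differently, so treat this as the logic only, not the exact schema.

```python
import sqlite3


def find_unpowered_datasets(datasets, dataflow_outputs):
    """Return IDs of dataflow-type datasets that no dataflow lists as an output.

    datasets: iterable of (dataset_id, name, import_type) tuples (assumed layout)
    dataflow_outputs: iterable of (dataflow_id, dataset_id) tuples (assumed layout)
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE datasets (dataset_id TEXT, name TEXT, import_type TEXT)")
    conn.execute("CREATE TABLE dataflow_outputs (dataflow_id TEXT, dataset_id TEXT)")
    conn.executemany("INSERT INTO datasets VALUES (?, ?, ?)", datasets)
    conn.executemany("INSERT INTO dataflow_outputs VALUES (?, ?)", dataflow_outputs)

    # The LEFT JOIN keeps every dataset; rows with no matching output
    # come back with a NULL dataflow_id, which is what we filter on.
    rows = conn.execute(
        """
        SELECT d.dataset_id
        FROM datasets AS d
        LEFT JOIN dataflow_outputs AS o ON o.dataset_id = d.dataset_id
        WHERE LOWER(d.import_type) = 'dataflow'
          AND o.dataflow_id IS NULL
        ORDER BY d.dataset_id
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


# Made-up sample rows: ds-1 is still produced by dataflow df-10,
# ds-2 was a dataflow output that nothing produces any more,
# ds-3 is not a dataflow dataset at all.
datasets = [
    ("ds-1", "Sales Output", "dataflow"),
    ("ds-2", "Old Output", "dataflow"),
    ("ds-3", "Uploaded File", "file-upload"),
]
dataflow_outputs = [("df-10", "ds-1")]

unpowered = find_unpowered_datasets(datasets, dataflow_outputs)
print(unpowered)  # prints ['ds-2']
```

In Magic ETL the same idea should be a Left Join tile on the dataset ID followed by a filter on the two conditions, with the datasets report on the left side so that unmatched datasets are kept.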