Is there a way to identify orphan datasets that have no flows or cards dependent on them?


I'm looking to do two things. 1) Clean up unused datasets 2) Identify the most used datasets. Are there cards that will help with this?


  • GrantSmith

    Hi @user036002 ,


    You'll want to look at the Domo Governance Third Party datasets. They contain information about the different objects within Domo, along with their metadata.


    Specifically, check the Data Set Details dataset for datasets where card count = 0.

    You'll also need to do some ETL magic to join the datasets to dataflows to confirm that a dataset isn't an input dataset on any dataflow.


    The inverse would help you identify the most used datasets: sum(card count) + count(dataflows with dataset as input).
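The join described above can be sketched outside of Domo with pandas. This is only an illustration under assumptions: the two DataFrames stand in for exports of the governance datasets, and every column name (`dataset_id`, `card_count`, `input_dataset_id`, `dataflow_id`) is hypothetical, so map them to whatever your governance datasets actually expose.

```python
import pandas as pd

# Hypothetical stand-ins for the governance datasets; real column names will differ.
datasets = pd.DataFrame({
    "dataset_id": ["a", "b", "c", "d"],
    "name": ["Sales", "Old Export", "Inventory", "Scratch"],
    "card_count": [12, 0, 0, 0],
})
dataflow_inputs = pd.DataFrame({
    "dataflow_id": [1, 2, 2],
    "input_dataset_id": ["a", "a", "c"],
})

# Count the distinct dataflows that read each dataset.
input_counts = (
    dataflow_inputs.groupby("input_dataset_id")["dataflow_id"]
    .nunique()
    .rename("dataflow_count")
    .reset_index()
)

# A left join keeps datasets that never appear as a dataflow input.
usage = datasets.merge(
    input_counts, how="left", left_on="dataset_id", right_on="input_dataset_id"
).drop(columns="input_dataset_id")
usage["dataflow_count"] = usage["dataflow_count"].fillna(0).astype(int)

# Orphans: no cards and not an input to any dataflow.
orphans = usage[(usage["card_count"] == 0) & (usage["dataflow_count"] == 0)]

# Most used: card count plus the number of dataflows using the dataset as input.
usage["usage_score"] = usage["card_count"] + usage["dataflow_count"]
most_used = usage.sort_values("usage_score", ascending=False)

print(orphans[["dataset_id", "name"]])
print(most_used[["dataset_id", "usage_score"]])
```

Inside Domo the same shape would be a left join in Magic ETL followed by a filter on nulls; the pandas version is just a compact way to show the logic.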



  • The (Domo Governance) DG Dataflow Details dataset will tell you which datasets are used as inputs into a dataflow. JOIN that to the DataSet dataset to identify which datasets are not used in ETL. Be careful with how you treat fusions if you use them, because they don't follow the same dataset input / output pattern.


    Consider metrics like 'not used in dataflow', 'last time dataset updated', and 'last time dataflow was modified'. Some people set up fire-and-forget update schemes, so the data looks current but it's actually a stale, unused pipeline. Also consider counting the number of cards and/or the number of pages a card is shared to (DG Cards and Pages).
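As a rough sketch of the "looks current but is actually unused" check, the snippet below labels each dataset from a few metrics. The column names (`last_updated`, `card_count`, `used_in_dataflow`), the 7-day freshness window, and the status labels are all assumptions for illustration, not fields or thresholds taken from the governance datasets.

```python
import pandas as pd

# Fixed reference date so the example is reproducible; use the current date in practice.
as_of = pd.Timestamp("2024-06-01")

# Hypothetical per-dataset metrics.
datasets = pd.DataFrame({
    "dataset_id": ["a", "b", "c"],
    "last_updated": pd.to_datetime(["2024-05-31", "2024-05-30", "2023-01-15"]),
    "card_count": [5, 0, 0],
    "used_in_dataflow": [True, False, False],
})

# Unused: no cards and not feeding any dataflow.
unused = (datasets["card_count"] == 0) & ~datasets["used_in_dataflow"]

# Recently updated: refreshed within the assumed 7-day window.
recently_updated = (as_of - datasets["last_updated"]) <= pd.Timedelta(days=7)

datasets["status"] = "in use"
datasets.loc[unused & recently_updated, "status"] = "fresh but unused"
datasets.loc[unused & ~recently_updated, "status"] = "stale and unused"

print(datasets[["dataset_id", "status"]])
```

The "fresh but unused" group is the fire-and-forget case: something keeps refreshing the data, but nothing downstream consumes it.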


    Also consider tying in the Activity Log to count how often cards from a dataset were viewed in the last 90 days.
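One way to picture the Activity Log idea is the pandas sketch below, which counts card views per dataset over a 90-day window. The table layouts, the `VIEWED` action value, and the column names are assumptions made for the example; check them against the Activity Log and card datasets in your own instance.

```python
import pandas as pd

# Fixed reference date so the example is reproducible; use the current date in practice.
as_of = pd.Timestamp("2024-06-01")

# Hypothetical card-to-dataset mapping and activity log extract.
cards = pd.DataFrame({"card_id": [10, 11, 12], "dataset_id": ["a", "a", "b"]})
activity = pd.DataFrame({
    "card_id": [10, 10, 11, 12],
    "action": ["VIEWED", "VIEWED", "VIEWED", "VIEWED"],
    "event_time": pd.to_datetime(
        ["2024-05-20", "2024-04-01", "2024-05-25", "2023-11-01"]
    ),
})

# Keep only view events inside the 90-day window.
cutoff = as_of - pd.Timedelta(days=90)
recent_views = activity[
    (activity["action"] == "VIEWED") & (activity["event_time"] >= cutoff)
]

# Attach each view to its card's dataset and count views per dataset.
# reindex adds datasets whose cards had no recent views, with a count of 0.
views_per_dataset = (
    recent_views.merge(cards, on="card_id")
    .groupby("dataset_id")
    .size()
    .reindex(cards["dataset_id"].unique(), fill_value=0)
    .rename("views_90d")
)

print(views_per_dataset)
```

A dataset with cards but zero recent views is a cleanup candidate even though its card count is above zero.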



    Jae Wilson