Can I use DomoStats to see which datasets feed into dataflows?

I've started using DomoStats to see how many datasets and dataflows we have, both to build a holistic view of the system and to work out a naming convention for our internal datasets.

I can pull in the names and IDs of the dataflows and datasets, and I can pull in the history of the datasets, which tells me which dataflow creates each dataset.

But I can't find which datasets are used as inputs by dataflows and which are orphans.

Am I just missing something? 

Any help would be appreciated.

Best Answer

  • AS
    AS Coach
    Answer ✓

    I haven't been able to use DomoStats for this purpose. Being able to see both the upstream and downstream lineage of dataflows is a frequent request; unfortunately, we can currently view only upstream lineage.

    You used to be able to search for a dataset's name in the DataFlows section of the Data Center and have every dataflow it feeds into appear in the results, but it looks like that is no longer working.

    Aaron
    MajorDomo @ Merit Medical


Answers

  • I would love to see this improvement! Just as the DomoStats Datasets report has the Cards_Powered count field, a Dataflows_Included_In count field would make it much easier to spot orphans.

    Glad to know I'm not the only one looking for an easier way to track this down!

  • The ability to see all the dataflows that use a particular dataset would be a really big 'leg up' on housekeeping in our Domo instance! While we can see all the datasets that feed into a dataset through the 'lineage' view, there is no visibility going the other way. Over time, our instance has filled with datasets whose utility we have little, if any, visibility into.

    While we can flag a dataset as obsolete through naming, the only way to know for certain whether a dataset feeds a dataflow is to drop it and wait for the fallout.

    We can resolve fallout pretty quickly in frequently running dataflows; those that update only periodically are more problematic.

    In the meantime, the unidentified debris in our instance continues to grow.

  • You can do that with the DataGovernance connector, which lets you see all dataflows and datasets as well as what is attached to what. It will take some work, though, to build the flow that puts everything together.

    Take a look at the post below for a better idea of how to do it:

    https://dojo.domo.com/t5/Data-Sources-and-Connectors/How-to-get-the-information-for-Domo-Governance-Datasets-to-the/m-p/40918#M2723

    Domo Arigato!

  • Godiepi - Next DomoPalooza I owe you a 'beverage'!  Didn't know we had access to the DataGovernance data in our instance.  So much new stuff has been pushed out - which is a great thing - that it is hard to keep up with it all and still keep on top of our usual drudge work.  

     

    Have the datasets populating now, and will see about making sense of them soon.  

     

    I appreciate both your 'quickness' in responding to my post AND your pointing me in the right direction for this.

     

    Thank you! Thank you!

  • @LardnerPete Awesome, I'm glad I could help you!

    Domo Arigato!

  • I do not know when they were added, but there are now DomoStats reports for DataFlow Input DataSources and DataFlow Output DataSources. Combining the two lets you see the whole "envelope" around a dataflow, and connecting inputs to outputs lets you see the full chain of processing. Just remember to filter out any output that also appears as an input anywhere in the chain, or you will get infinite loops.
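
For anyone who wants to try this outside of Domo, here is a rough Python sketch of the idea in the post above. It assumes the two reports have been exported as lists of (dataflow_id, dataset_id) pairs; those names, the sample IDs, and both helper functions are made up for illustration and are not the actual DomoStats column names. "Orphan" here only means "not an input to any dataflow", so a dataset flagged this way may still power cards. The `seen` set plays the role of the loop filter mentioned above.

```python
from collections import defaultdict, deque


def find_orphans(all_dataset_ids, input_rows):
    """Return datasets that never appear as an input to any dataflow."""
    used = {dataset_id for _dataflow_id, dataset_id in input_rows}
    return sorted(set(all_dataset_ids) - used)


def downstream_datasets(start_dataset, input_rows, output_rows):
    """Return every dataset reachable from start_dataset via dataflows."""
    # Index: dataset -> dataflows that read it.
    flows_by_input = defaultdict(set)
    for dataflow_id, dataset_id in input_rows:
        flows_by_input[dataset_id].add(dataflow_id)

    # Index: dataflow -> datasets that it writes.
    outputs_by_flow = defaultdict(set)
    for dataflow_id, dataset_id in output_rows:
        outputs_by_flow[dataflow_id].add(dataset_id)

    # Breadth-first walk. `seen` stops us from revisiting a dataset,
    # which is what prevents an infinite loop when a chain is circular.
    seen = {start_dataset}
    queue = deque([start_dataset])
    while queue:
        current = queue.popleft()
        for dataflow_id in flows_by_input[current]:
            for output_id in outputs_by_flow[dataflow_id]:
                if output_id not in seen:
                    seen.add(output_id)
                    queue.append(output_id)

    seen.discard(start_dataset)
    return sorted(seen)


# Hypothetical sample rows standing in for the two exported reports.
input_rows = [
    ("df1", "raw_sales"),
    ("df1", "raw_customers"),
    ("df2", "sales_clean"),
    ("df3", "sales_summary"),
]
output_rows = [
    ("df1", "sales_clean"),
    ("df2", "sales_summary"),
    ("df3", "sales_clean"),  # writes back upstream, creating a cycle
]
all_datasets = [
    "raw_sales", "raw_customers", "sales_clean", "sales_summary", "old_export",
]

print(find_orphans(all_datasets, input_rows))
# ['old_export']
print(downstream_datasets("raw_sales", input_rows, output_rows))
# ['sales_clean', 'sales_summary']
```

Inside Domo itself, the same join would have to be built as a dataflow over the two reports; the sketch is only meant to show the shape of the logic.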