Removing Datasets

ARosser
ARosser Member

For the past six months or so, I have been revamping the underlying data structure that feeds into Domo. As part of this, I have created new dataflows and datasets in Domo. I still have a handful of reports that are populated from the old dataflow. That dataflow is turned off and its dataset is stale, but we still need those reports for a bit longer because they contain historical data we have not yet integrated into the new process.

So my question: if I were to delete the input datasets that the dataflow used to create its single output, would the output dataset remain in its current state? I'm asking because those input datasets are taking up a LOT of credits and it would be great to remove them.

Best Answer

  • ColemenWilson
    Answer ✓

    You could just make the dataset recursive and then you could remove the inputs.

    If I solved your problem, please select "yes" above

Answers

  • david_cunningham
    edited June 5

    It sounds like you have entirely separate new datasets powering your new dataflows. In that case, you don't have to delete the input datasets to stop using credits; just stop them from updating, the same way you stopped the dataflows from updating.

    To directly answer your question, though: if you delete the input dataset to an ETL, the cards powered by the ETL output won't be deleted, but the ETL will have a missing input dataset and will not be able to run. Any cards that are powered directly by the deleted dataset will be deleted as well. You can view the downstream impact/lineage of a dataset to see how deleting it will affect Domo.
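
To make those rules concrete, here is a rough sketch in plain Python. It is not a Domo API; the lineage table and every dataset, dataflow, and card name in it are hypothetical and exist only to illustrate the three outcomes described above.

```python
# Hypothetical lineage: each dataset maps to the things that read from it.
# All names are made up for illustration; none come from a real instance.
LINEAGE = {
    "old_input_a": [("etl", "old_dataflow"), ("card", "Raw Input Table")],
    "old_input_b": [("etl", "old_dataflow")],
    "old_output": [("card", "Historical Sales"), ("card", "Legacy KPIs")],
}

# Which dataset each (hypothetical) ETL writes to.
ETL_OUTPUTS = {"old_dataflow": "old_output"}


def deletion_impact(dataset):
    """Summarize what happens when `dataset` is deleted, per the rules above."""
    consumers = LINEAGE.get(dataset, [])
    # Cards powered directly by the deleted dataset are deleted with it.
    deleted_cards = sorted(name for kind, name in consumers if kind == "card")
    # ETLs that read the dataset are left with a missing input and cannot run.
    broken_etls = sorted(name for kind, name in consumers if kind == "etl")
    # Cards on the ETL's output survive, frozen at the last successful run.
    surviving_cards = sorted(
        name
        for etl in broken_etls
        for kind, name in LINEAGE.get(ETL_OUTPUTS[etl], [])
        if kind == "card"
    )
    return {
        "deleted_cards": deleted_cards,
        "broken_etls": broken_etls,
        "surviving_cards": surviving_cards,
    }


impact = deletion_impact("old_input_a")
print(impact)
```

In a real instance, the lineage view is the source of truth; this only mirrors the reasoning.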

    David Cunningham

    ** Was this post helpful? Click Agree 😀, Like 👍️, or Awesome ❤️ below **
    ** Did this solve your problem? Accept it as a solution! ✔️**

  • ARosser
    ARosser Member

    But it appears I am getting dinged on my credits for the rows themselves in those datasets. As an example, this dataset hasn't run in 9 months.

  • david_cunningham

    Ahh - gotcha. I was referring strictly to credits used from creating/updating tables. Domo also charges for data storage at a rate of 1 credit per million storage rows per month.
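
As a back-of-the-envelope check, the rate quoted above can be turned into a quick estimate. This is only a sketch: the row counts are made up, and it assumes storage is billed proportionally with no rounding or tiers, which may not match your contract.

```python
# Rate as quoted in this thread; confirm against your own Domo agreement.
CREDITS_PER_MILLION_ROWS_PER_MONTH = 1


def monthly_storage_credits(row_counts):
    """Estimate monthly storage credits for a {dataset_name: row_count} mapping."""
    total_rows = sum(row_counts.values())
    return total_rows / 1_000_000 * CREDITS_PER_MILLION_ROWS_PER_MONTH


# Hypothetical stale input datasets and their row counts.
stale_inputs = {"old_input_a": 250_000_000, "old_input_b": 40_000_000}
print(monthly_storage_credits(stale_inputs))  # 290.0
```

Under those assumptions, 290 million stale rows would cost about 290 credits a month even if nothing ever runs.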

    Since you are in the process of migrating, I think this might be worth reaching out to your Domo Rep to explore possible options.

    Another possible option would be to aggregate up to an acceptable grain rather than storing all rows. You mentioned that these need to be kept to power cards. Say, for example, the smallest date grain you are displaying is monthly. You could aggregate the data up to monthly, which would significantly reduce your storage rows and cut down on credits while still powering the cards/ETLs that you want to keep for now. You'll have to be careful to make sure that all reports connected via ETL still work (don't change column names, output dataset names, etc.), and as I mentioned before, aggregate only up to the smallest grain that is used.
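
Here is a minimal sketch of that roll-up in plain Python, just to show the idea. The column names (`date`, `region`, `sales`) and the sample rows are hypothetical; in Domo you would do the equivalent with a Group By step in the ETL.

```python
from collections import defaultdict
from datetime import date


def aggregate_to_month(rows):
    """Roll daily rows up to one row per (month, region), summing sales.

    Column names are kept unchanged so anything built on the output
    would still find the fields it expects.
    """
    totals = defaultdict(float)
    for row in rows:
        month_start = row["date"].replace(day=1)  # truncate to first of month
        totals[(month_start, row["region"])] += row["sales"]
    return [
        {"date": month, "region": region, "sales": sales}
        for (month, region), sales in sorted(totals.items())
    ]


# Hypothetical daily data.
daily = [
    {"date": date(2024, 1, 3), "region": "East", "sales": 100.0},
    {"date": date(2024, 1, 17), "region": "East", "sales": 50.0},
    {"date": date(2024, 1, 17), "region": "West", "sales": 70.0},
    {"date": date(2024, 2, 2), "region": "East", "sales": 30.0},
]
monthly = aggregate_to_month(daily)
print(len(daily), "rows ->", len(monthly), "rows")
```

This only works cleanly for additive measures such as sums and counts. Averages and distinct counts cannot be recomputed from pre-aggregated rows, so check what each card calculates before rolling up.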

    To be frank, I'm not sure the idea I suggested above is worth the effort, especially if you're going to be migrating these over the next couple of months. It depends on exactly how many credits are being consumed by storage rows.

    David Cunningham


  • ARosser
    ARosser Member

    @ColemenWilson That seems like the best option. Now to go down that rabbit hole.

  • ColemenWilson

    It won't be too bad. Make the output dataset the dataflow's only input, add a Remove Duplicates tile to prevent duplicate rows, and set the dataflow to update manually.
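
To show why that keeps the data intact, here is a small plain-Python model of the steps above. It is not Domo code: the function name and the sample rows are hypothetical, and the dedupe step only stands in for the Remove Duplicates tile.

```python
def run_recursive_dataflow(previous_output, new_rows=()):
    """Model one run of a dataflow whose output is fed back in as its input.

    previous_output: rows from the last run (the output dataset itself).
    new_rows: rows from any other inputs; empty once the old inputs are removed.
    """
    combined = list(previous_output) + list(new_rows)
    seen = set()
    result = []
    for row in combined:
        key = tuple(sorted(row.items()))  # whole-row key, like a dedupe on all columns
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


# Hypothetical historical rows already sitting in the output dataset.
history = [{"id": 1, "sales": 100}, {"id": 2, "sales": 50}]

# With the output as the only input, every run reproduces the same rows.
after_first_run = run_recursive_dataflow(history)
after_second_run = run_recursive_dataflow(after_first_run)
print(after_second_run == history)  # True
```

With the update set to manual, the dataflow would not need to run again at all; the dedupe step is a safety net in case it does.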
