Is there a way to remove historical data from an "appending" DataSet?

rado98 Contributor

Is there a way to remove rows/data from a DataSet that has been appending rows since its creation?

The data I need to remove is old, so it can no longer be removed from the History tab, and it does not come from a single upload anyway.

As the DataSet is also a Legacy DataSet from a credit usage point of view, a new DataFlow/DataSet to remove the data is not an option.

One of the DataSets is collected by email, and it is so big at this point that copying all the data, wiping the DataSet, and then re-uploading the needed data is also not an option.

Answers

  • You might be able to utilize the Java CLI with the delete-rows command to delete specific upsert keys in your dataset. If you need to define an upsert key on your dataset you can use the define-upsert command.

    As a precaution, I'd recommend backing up your dataset and trying this out on a copy first before modifying the original.

  • rado98 Contributor

    OK, thanks, I'll give that a try. I was hoping for something more user-friendly, in all honesty.

    In principle, could I just download the dataset, modify it, and then re-upload it? I would imagine that is an easier operation.

  • You could certainly download, re-upload, and then create a dataflow that appends your scheduled data to your newly uploaded dataset.

    If your datasets are the result of a dataflow, you could also use a recursive dataflow, where you use the dataset as its own input and output and then filter out the rows you don't want to keep. Be sure to test and keep a backup of your data with this approach until you're sure you've got it right!

    Creating a Recursive/Snapshot DataFlow in Magic ETL (domo.com)
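
For the download, modify, and re-upload route discussed above, the "modify" step could look roughly like the Python sketch below. It is only an illustration and uses no Domo API: it assumes the DataSet has been exported to CSV, that it has a date column (the name `order_date`, the `%Y-%m-%d` format, and the cutoff date are all hypothetical), and that every row older than the cutoff should be dropped. It streams row by row, so a large export does not have to fit in memory.

```python
import csv
import io
from datetime import date, datetime


def drop_rows_before(src, dst, date_column, cutoff, date_format="%Y-%m-%d"):
    """Copy CSV rows from src to dst, keeping only rows dated on or after cutoff.

    src and dst are open text file objects. Returns (kept, dropped) row counts.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    kept = dropped = 0
    for row in reader:
        row_date = datetime.strptime(row[date_column], date_format).date()
        if row_date >= cutoff:
            writer.writerow(row)
            kept += 1
        else:
            dropped += 1
    return kept, dropped


# Small in-memory demonstration; the column name and values are made up.
sample = io.StringIO(
    "order_date,amount\n"
    "2019-05-01,10\n"
    "2023-02-14,25\n"
    "2024-01-03,40\n"
)
cleaned = io.StringIO()
kept, dropped = drop_rows_before(sample, cleaned, "order_date", date(2022, 1, 1))
print(f"kept {kept} rows, dropped {dropped} rows")
```

With real files, open both the export and the output with `newline=""`, as the `csv` module documentation recommends, and compare the kept and dropped counts against what you expect before re-uploading anything. Keep the original export as the backup.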