I have a large dataset that needs some cleaning. We have appended to this dataset daily over several years and some of the data values have drifted over time as users have renamed what are essentially labels for different entities. If this were SQL, I could write an update statement and quickly replace the values in label fields based on an id field. For example:
UPDATE dataset JOIN updated_labels ON dataset.id_column = updated_labels.id_column SET dataset.label_field = updated_labels.label_field;
Is there a way to run this kind of data editing operation directly in Domo? So far, the solutions I find fall into these categories:
- use data edit mode to select batches that were appended to the data, remove those batches, and then re-import the suspect data batches
- use an ETL workflow to transform the data and output to a new dataset
The first option won't work here because there are far too many historical appends to go back and re-import them all. It would take a considerable amount of effort to rebuild the data.
The second option means treating the original dataset as immutable and copying it over to an identical dataset with the labels fixed. The drawbacks here are it would be a waste of storage to keep two copies of the dataset, and I would have to edit all of my cards to point to the new dataset.
Is there instead a way to simply clean an existing dataset? I just need a way to run a one-time, ad-hoc query to realign data values.