Can you remove duplicated rows of data in a data set?
We have a data set that is updated from a Google Sheet. I see that back in June we have duplicate entries for June 22 and June 23 in the data set, yet the data in the Google Sheet is not duplicated.
Is there a way to edit the data that is in the data set to remove these duplicates?
Answers
-
Hi @AlexF
Short answer: not easily.
If you have all of your data, you could completely reimport your dataset using the replace update method.
Alternatively, you could use a Magic ETL dataflow to remove the duplicates, either with the Remove Duplicates tile or with a Group By that groups all of your columns together and takes the min or max value of the metric field you need to pull. The grouping method could also be done in a dataset view.
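To make the two approaches concrete, here is a minimal plain-Python sketch of the logic, not Domo's implementation. The sample rows, the column names (`date`, `source`, `visits`), and both helper functions are hypothetical and exist only for illustration.

```python
# Hypothetical rows: the June 22 and June 23 entries appear twice.
rows = [
    {"date": "June 21", "source": "web", "visits": 10},
    {"date": "June 22", "source": "web", "visits": 12},
    {"date": "June 22", "source": "web", "visits": 12},
    {"date": "June 23", "source": "web", "visits": 15},
    {"date": "June 23", "source": "web", "visits": 15},
]


def remove_exact_duplicates(rows):
    """Keep the first occurrence of each fully identical row."""
    seen = set()
    unique = []
    for row in rows:
        # Build a hashable fingerprint from every column of the row.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def group_by_max(rows, key_columns, metric):
    """Group on key_columns and keep the row with the largest metric."""
    best = {}
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        if key not in best or row[metric] > best[key][metric]:
            best[key] = row
    return list(best.values())


deduped = remove_exact_duplicates(rows)
grouped = group_by_max(rows, ["date", "source"], "visits")
print(len(rows), len(deduped), len(grouped))
```

When the duplicates are exact copies, both approaches return the same rows. The grouping approach also collapses rows that share the key columns but differ in the metric, so it is worth checking which behavior you actually want.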
3 -
Thanks Grant. I am really puzzled why this occurred in the first place.
0 -
@AlexF - We have gotten in the habit with a lot of datasets of just expecting that at some point something can go wacky and we can get duplicate data.
So we think about data coming in and wonder if there is a way for duplicate data to come in (a script runs twice ... whatever) and build that into our dataflows.
Maybe we just have bad luck, but it is common enough for us that we account for it almost every time.
0 -
@AlexF Domo has two ingest methods when receiving data: either a full REPLACE (all rows in the dataset are truncated and then the new data is brought in) or APPEND (rows are tacked on to the bottom). That's it.
There is also UPSERT and PARTITIONing, but both of those are based on the APPEND concept with a little data processing, and usually you know if you're using those features.
Check your connectors. Is it APPEND or REPLACE? If it's APPEND, an easy way to get duplicate rows is to run the same connector twice in the same day.
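A toy model of the two ingest methods shows why a second run on the same day only hurts under APPEND. This is a plain-Python sketch, not Domo code; the function names and sample rows are hypothetical.

```python
def replace_ingest(dataset, incoming):
    """REPLACE: discard the existing rows, then load the new data."""
    return list(incoming)


def append_ingest(dataset, incoming):
    """APPEND: tack the incoming rows onto the bottom of the dataset."""
    return dataset + list(incoming)


# Hypothetical rows delivered by one connector run.
todays_rows = [
    {"date": "June 22", "visits": 12},
    {"date": "June 23", "visits": 15},
]

# Run the same connector twice in one day under each method.
replaced = []
appended = []
for _ in range(2):
    replaced = replace_ingest(replaced, todays_rows)
    appended = append_ingest(appended, todays_rows)

print(len(replaced))  # 2: the second run overwrote the first
print(len(appended))  # 4: every row now appears twice
```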
Jae Wilson
1 -
Thanks
0