Deduplicating a table based on the content of certain columns
I have a dataset with about 20 columns. The first column contains ID numbers, and many of the IDs appear multiple times. The values in the other columns are repeated across those duplicate rows as well, except for one column named last_updated, which holds a date.
I would like to de-dupe this dataset and keep only the row with the most recent date in the last_updated column for each ID. Is there a way to do this?
Comments
The easiest way to do this is to use the Group By tile in Magic ETL. Add all your columns to the select except for the last_updated column. Then add last_updated to the aggregated column list and choose Max. This gives you one row per unique combination of the other columns, with the most recent last_updated date for each.
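If it helps to see the logic outside of Magic ETL, here is a minimal sketch in plain Python of what a group-by with Max on last_updated does. The function name `dedupe_latest` and the sample columns `id` and `name` are made up for illustration; only last_updated comes from the question. It is not Domo code, just the same idea written out.

```python
from datetime import date

def dedupe_latest(rows, date_col="last_updated"):
    """Keep one row per unique combination of the non-date columns,
    carrying the maximum (most recent) value of date_col."""
    latest = {}
    for row in rows:
        # Group key: every column except the date column, in a stable order.
        key = tuple(sorted((k, v) for k, v in row.items() if k != date_col))
        # Keep the row only if it is the first or the newest seen for this key.
        if key not in latest or row[date_col] > latest[key][date_col]:
            latest[key] = row
    return list(latest.values())

# Hypothetical sample data; "id" and "name" are assumed column names.
rows = [
    {"id": 1, "name": "Acme", "last_updated": date(2023, 1, 5)},
    {"id": 1, "name": "Acme", "last_updated": date(2023, 3, 9)},
    {"id": 2, "name": "Globex", "last_updated": date(2023, 2, 1)},
]
result = dedupe_latest(rows)
```

One thing to watch for: grouping on all the other columns only collapses rows whose other values really are identical. If any other column differs between two rows with the same ID, both rows survive.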