Remove duplicate unknown comma separated values within a single 'cell'
Hello good people of Domo
Very frustrated trying to remove duplicates WITHIN a comma seperated list. I have the following webform:
If I 'Group By' (Magic ETL) by 'Who' & 'Manufacturer' (combining 'Model' & 'Colour' with commas) I get the below. This has repeats in R5:C3 (2x 'TT'), R6:C3 (2x V70) and 'Black' is repeated in R5:C4 & R6:C4.
What I would like: The above without the duplicates in the cells mentioned.
Other things you might need to know:
- My real data has nearly 1,000,000 rows in it.
- I will not know (ahead of time) what the value will be
- There could be one value, two values or any number of values separated by the comma. I won't know.
I've tried daisy chaining multiple 'Group By' tiles together; I've tried pulling data out, dealing with it and joining it back later (gets messy).
I'm happy to do this in ETL (my preference) but could do it in beast mode if better.
Any ideas, please?
Thanks Domos
Charlie
Best Answers
-
Feed your dataset into a select columns to pull the who manufacturer and model then put that into a remove duplicates based on the who and manufacturer. Then feed it to a group by and separate with commas. Repeat this process replacing model with color. Then join your two group bys based on who and manufacturer
**Was this post helpful? Click Agree or Like below**
**Did this solve your problem? Accept it as a solution!**0 -
@Charlie Bantoft It sounds like you want this formula:
GROUP_CONCAT(DISTINCT Model SEPARATOR ', ')
Use the "Add Formula" button instead of the "Add Column" button in Group By and use that formula in place of the "Combine strings separated by ," aggregations. Don't forget to replaceModel
withColour
for the other one.
EDIT: For context, the "Combine strings separated by ," aggregation is identical to the formula (note that it does not include theDISTINCT
keyword):GROUP_CONCAT(the_column SEPARATOR ', ')
Randall Oveson <randall.oveson@domo.com>
1
Answers
-
Feed your dataset into a select columns to pull the who manufacturer and model then put that into a remove duplicates based on the who and manufacturer. Then feed it to a group by and separate with commas. Repeat this process replacing model with color. Then join your two group bys based on who and manufacturer
**Was this post helpful? Click Agree or Like below**
**Did this solve your problem? Accept it as a solution!**0 -
You can also try concatenating all of your columns together into one column, deleting duplicates based off of that column, then splitting the columns back out based off your chosen delimiter.
0 -
Thanks both. @jessdoe I love this idea and it'll work for other problems I have but not this one. Thanks anyway.
@GrantSmith I think I follow but could you give me a little more detail please? Thanks so much.
0 -
Here's a screenshot with an example (it's using bogus data but illustrates what I'm talking about):
**Was this post helpful? Click Agree or Like below**
**Did this solve your problem? Accept it as a solution!**2 -
Thanks @GrantSmith I've got that flow but am having trouble with the configuration somewhere. I'll continue to play around with it - I'm on the right track. Thanks very much indeed.
0 -
@Charlie Bantoft It sounds like you want this formula:
GROUP_CONCAT(DISTINCT Model SEPARATOR ', ')
Use the "Add Formula" button instead of the "Add Column" button in Group By and use that formula in place of the "Combine strings separated by ," aggregations. Don't forget to replaceModel
withColour
for the other one.
EDIT: For context, the "Combine strings separated by ," aggregation is identical to the formula (note that it does not include theDISTINCT
keyword):GROUP_CONCAT(the_column SEPARATOR ', ')
Randall Oveson <randall.oveson@domo.com>
1 -
Thank you! Once I have this column sorted @rco how would you advise filtering on its results with an in-dashboard filter?
so, for example: if a row has the data string 'Volvo, Porsche, Austin' and I want to filter all rows that 'contain' 'Porsche' using a drop down that has all car manufacturers on it - how would I do that?
The only solution I've found is to use a quick filter which means opening individual cards but I'd like the filter to control the whole dashboard.
Many thanks.
0 -
Since there isn't a "string contains" operator in the dashboard filters, I've had to come up with a somewhat more tedious way of accomplishing this. It does involve opening and modifying all the cards individually, but only during the setup; once it's done you can use a single control on the dashboard to do your filtering.
On the first card…
1. Edit in Analyzer and Add Calculated Field. We'll call it "Model Match"
2. In the Beast Mode editor, go to the Variables tab and Add Variable. Leave its default value empty, let it have type Text, and add a Control for it of type Textbox Input. We'll call the variable "Control_Model"
3. Still in the Beast Mode editor, write this formula:CASE WHEN INSTR(LOWER(Model), LOWER(Control_Model)) > 0 THEN 1 ELSE 0 END
4. Validate and save the "Model Match" calculated field.
5. Add the "Model Match" field as a filter for the card, filtering to the case where it is equal to 1.
6. Save the card and exit Analyzer.
On every card on the dashboard, repeat steps 1 and 3 through 6, skipping 2 since we only have to create the variable once.
Finally, on the dashboard, hit the wrench menu and Edit Dashboard. Add a Control where you choose, and make control the "Control_Model" variable.
With that, you should be able to use the one control to filter the whole dashboard to specific models in the concatenated model list. When the control is left blank, it should show all data.Randall Oveson <randall.oveson@domo.com>
0 -
Thanks very much indeed @rco - going to try this shortly. Will let you know how I get on. Really appreciate it.
0 -
This is been very useful by the way @rco. Thank you very much indeed.
1
Categories
- All Categories
- 1.8K Product Ideas
- 1.8K Ideas Exchange
- 1.5K Connect
- 1.2K Connectors
- 300 Workbench
- 6 Cloud Amplifier
- 8 Federated
- 2.9K Transform
- 100 SQL DataFlows
- 616 Datasets
- 2.2K Magic ETL
- 3.8K Visualize
- 2.5K Charting
- 737 Beast Mode
- 56 App Studio
- 40 Variables
- 684 Automate
- 176 Apps
- 452 APIs & Domo Developer
- 46 Workflows
- 10 DomoAI
- 35 Predict
- 14 Jupyter Workspaces
- 21 R & Python Tiles
- 394 Distribute
- 113 Domo Everywhere
- 275 Scheduled Reports
- 6 Software Integrations
- 123 Manage
- 120 Governance & Security
- 8 Domo Community Gallery
- 38 Product Releases
- 10 Domo University
- 5.4K Community Forums
- 40 Getting Started
- 30 Community Member Introductions
- 108 Community Announcements
- 4.8K Archive