How to count strings in a column containing a comma delimited list of strings in each row
Hello, I have a field in which each row contains a comma delimited list of strings. I'm trying to find a way to display how often each string appears in the dataset. For example, the column might be "City":
Row 1 | New York,Boston,San Francisco
Row 2 | Boston
Row 3 | New York,San Diego
Row 4 | New York,Topeka
I like to create some data display that shows New York:3, Boston:2, San Francisco:1, San Diego:1, Topeka:1.
A regular bar graph doesn't recognize the comma and treats each unique string including the commas as a separate entry.
I can split the field into multiple, City1, City2, City3, etc, but I don't know of a data display that looks into multiple columns.
Any advice is appreciated, thank you.
The way I did it in SQL, it doesn't matter how many commas there are. In ETL, I'd have to create something recusive to make sure you can get any number of elements typed into a string. If you know you won't exceed the number of delimited items you show, something like this will work.
Using that flow, I then went out and had some website give me random city names which I typed into my webform.2
You should be able to do a calculated field such as
Citylike '%New York%' then 'New York'
Citylike '%San Diego%' then 'San Diego'
Citylike '%San Francisco%' then 'San Francisco'
Citylike '%Boston%' then 'Boston'
Citylike '%Topeka%' then 'Topeka'
CASE when INSTR(
City,'New York')>0 then 'New York'
City,'San Diego')>0 then 'San Diego'
City,'Boston')>0 then 'Boston'
City,'San Francisco')>0 then 'San Francisco'
City,'Topeka')>0 then 'Topeka'
Then use the field in a bar chart1
Thanks for that solution @ArborRose. It works well for this example.
What if there are more than five possible strings, for example, what if any city in the U.S. could appear and the data set has tens of thousands of rows? It wouldn't be reasonable to make a INSTR for every potential. Any ideas on how to make this scalable?0
I'd need to ponder a bit. First instinct - try to break up the string by the delimiter, in this case a comma. Feed the pieces to a table (aka a list) and then aggregate using count.0
@tom_rotchadl@tom_rotchadl How many cities could be listed in a single row? I'd suggest using a combination of the Split Column and Dynamic Unpivot ETL tiles:0
As @MichelleH mentions, in the ETL I'd be looking to use split to get the pieces from the string. I don't have a chance right now to try it in Domo, but I did a quick mock up in SQL with code shown below.
As I mentioned, I would split the strings to create a table as shown below. Then aggregate the table using count to get the source for the bar chart.0
- 7.7K All Categories
- 7 Connect
- 922 Connectors
- 245 Workbench
- 477 Transform
- 1.8K Magic ETL
- 60 SQL DataFlows
- 446 Datasets
- 57 Visualize
- 205 Beast Mode
- 2.1K Charting
- 8 Variables
- 23 Cards, Dashboards, Stories
- 4 Automate
- 349 APIs & Domo Developer
- 84 Apps
- 15 Predict
- 3 Jupyter Workspaces
- 12 R & Python Tiles
- 242 Distribute
- 60 Domo Everywhere
- 241 Scheduled Reports
- 15 Manage
- 36 Governance & Security
- 42 Product Ideas
- 1.1K Ideas Exchange
- 2 Community Forums
- 15 Getting Started
- 2 Community Member Introductions
- 50 Community News
- 18 Event Recordings
- 576 日本支部