How do I take a portion of a column value and push it to a new column?
I have a column for campaign names that looks roughly like this
A1234567| Campaign 1 | Google Ads | ……
What I need to be able to do is read/grab all the non-space characters before the pipe and put that value in a new column called Campaign ID. In the case above, the new column for Campaign ID would contain A1234567. If the value does not start with an 'A' (or any other letter) followed by at least 1 number then I want to write 'unassigned' in the Campaign ID column. How would I do this?
Best Answers
-
I would recommend using the split column tile in magic ETL.
It would look something like this:
If I solved your problem, please select "yes" above
1 -
@dkonig If you are using MagicETL you can use the Split Column tile to separate a column based on a particular delimiter, in this case the pipe character. You can also do this using the SPLIT_PART function in a formula tile.
Edit: @colemenwilson beat me to it!
0 -
CASE WHEN REGEXP_LIKE(`field`, '^(A\d+).*$') THEN REGEXP_REPLACE(`field`, '^(A\d+).*$', '$1') ELSE 'unassigned' END
You can use a formula tile and a regular expression to pull out the values which start with A followed by some numbers.
^ - Start of the line
() - Match group to store the results found - in this case the first match group which is stored in the variable $1
A - the literal letter A
\d+ - One or more digits
.* 0 or more of anything
$ - The end of the string
**Was this post helpful? Click Agree or Like below**
**Did this solve your problem? Accept it as a solution!**3 -
With your description the Regex version would seem to be the best fit. Just keep in mind that you'd need to change this since you just mentioned the campaign code might be anywhere in the text, so something like this might do the trick: ^.*\b(A\d+).*$ in this case you don't care that the code starts. You can see the comparison on matches between the two versions here:
0
Answers
-
I would recommend using the split column tile in magic ETL.
It would look something like this:
If I solved your problem, please select "yes" above
1 -
@dkonig If you are using MagicETL you can use the Split Column tile to separate a column based on a particular delimiter, in this case the pipe character. You can also do this using the SPLIT_PART function in a formula tile.
Edit: @colemenwilson beat me to it!
0 -
@colemenwilson I completely missed there was an option for 'custom' when choosing a delimiter. I swear looked for it. But this only solves part of my problem. If the first value does not begin with a letter followed by a number then those values would also be added to the new column. Is there a way to prevent that and simply write 'unassigned' or 'unknown' in that new column?
0 -
Hmm. I think, while the Split Column tile is simple, that in my case there is a little more complexity. I think this might be a case to use the Add Formula tile? I also found out that older campaign names sometimes have the A1234567 at some other location in the campaign name. Like this
campaign1 - Google Ads - A1234567 - Other information (not at the beginning and a dash as a separator)
or
campaign1 Google Ads A1234567 Other information (not at the beginning and a space as a separator)
Thoughts on how to solve for this? I'm thinking the forumla needs to look for all instances where a string starts with "A" an is followed by a series of numbers which could be 7 or more followed by a space or something like that.
0 -
CASE WHEN REGEXP_LIKE(`field`, '^(A\d+).*$') THEN REGEXP_REPLACE(`field`, '^(A\d+).*$', '$1') ELSE 'unassigned' END
You can use a formula tile and a regular expression to pull out the values which start with A followed by some numbers.
^ - Start of the line
() - Match group to store the results found - in this case the first match group which is stored in the variable $1
A - the literal letter A
\d+ - One or more digits
.* 0 or more of anything
$ - The end of the string
**Was this post helpful? Click Agree or Like below**
**Did this solve your problem? Accept it as a solution!**3 -
With your description the Regex version would seem to be the best fit. Just keep in mind that you'd need to change this since you just mentioned the campaign code might be anywhere in the text, so something like this might do the trick: ^.*\b(A\d+).*$ in this case you don't care that the code starts. You can see the comparison on matches between the two versions here:
0 -
@marcel_luthi Very helpful. I was just informed today that they plan on dropping the first letter. Of course they needed to complicate this more. Can this be modified to find the string of numbers regardless of whether it starts with an A or not? And tRegex would be the regex to use in the CASE statement @GrantSmith supplied?
0 -
@GrantSmith I"m not sure I completely understand your CASE statement. This will create a new columns and put the extracted value in it? Does 'field' need to be replaced with the column/field name?
0 -
This would create a new field with the extracted value if it matches the logic, otherwise it's got a value of Unassigned.
`field` needs to be replaced with the source field / column which has your campaign name data.
**Was this post helpful? Click Agree or Like below**
**Did this solve your problem? Accept it as a solution!**1 -
I think this might suffice right for a RegEx that would capture any value that starts with and A or B followed by any number of digits or just a series of digits that is t least a minimum of 5?
^.\b(A\d+)|(B\d+)|(\d{5,}).$If the it is always a minimum of 5 digits then this might be even better or perhaps there is an even shorter version of this rule?
^.\b(A\{5,})|(B\{5,})|(\d{5,}).$
0 -
@GrantSmith This is the case statement I finalized on with some help from ChatGPT but this does not give me the expected result of returning just the match. Instead the entire campaign name is copied. What am I missing?
CASE WHEN REGEXP_LIKE(`name`, '(M\d{5,}|C\d{5,}|\d{5,})') THEN
REGEXP_REPLACE(`name`, '(M\d{5,}|C\d{5,}|\d{5,})', '$1')
ELSE
'unassigned'
END
0 -
Not sure if the version of MySQL currently used supports it, but you could try changing the REGEXP_REPLACE(`name`, '(M\d{5,}|C\d{5,}|\d{5,})', '$1') for REGEXP_SUBSTR(`name`, '(M\d{5,}|C\d{5,}|\d{5,})') (I know not all the MySQL REGEXP functions work, but it might be worth trying)
0 -
REGEXP_SUBSTR doesn't appear to be an available function
0 -
This is because your regex is searching for a mapping and then replacing that with the same thing it found. Because there is no REGEXP_SUBSTING you need to match the entire string and then it'll replace the entire string with your mapping.
You can do this with the ^ (start of string) and .* (any number of characters) and $ (end of string)
CASE WHEN REGEXP_LIKE(`name`, '^([MC]?\d{5,}).*$') THEN REGEXP_REPLACE(`name`, '^([MC]?\d{5,}).*$', '$1') ELSE 'unassigned' END
I've also simplified your regular expression
[] - Match any character in this group, in this case M or C
[]? - Match 0 or 1 of these characters in the group. This will handle M, C and no prefixes
**Was this post helpful? Click Agree or Like below**
**Did this solve your problem? Accept it as a solution!**0 -
Ah. Yes. I had to update it further based on @marcel_luthi's earlier answer where it would find the value anywhere in the campaign name and not just at the start.
Like this: ^.\b([MC]?\d{5,}).$
0
Categories
- All Categories
- 1.8K Product Ideas
- 1.8K Ideas Exchange
- 1.5K Connect
- 1.2K Connectors
- 296 Workbench
- 6 Cloud Amplifier
- 8 Federated
- 2.9K Transform
- 100 SQL DataFlows
- 614 Datasets
- 2.2K Magic ETL
- 3.8K Visualize
- 2.5K Charting
- 729 Beast Mode
- 53 App Studio
- 40 Variables
- 677 Automate
- 173 Apps
- 451 APIs & Domo Developer
- 45 Workflows
- 8 DomoAI
- 34 Predict
- 14 Jupyter Workspaces
- 20 R & Python Tiles
- 394 Distribute
- 113 Domo Everywhere
- 275 Scheduled Reports
- 6 Software Integrations
- 121 Manage
- 118 Governance & Security
- Domo Community Gallery
- 32 Product Releases
- 10 Domo University
- 5.4K Community Forums
- 40 Getting Started
- 30 Community Member Introductions
- 108 Community Announcements
- 4.8K Archive