Regex Command in Magic ETL for cutting the end of a group of URL strings

I've seen a few similar topics but I was unable to know for sure...

 

I am new to Domo and was attempting to cut the latter part of several URL's.

 

For example...

 

.../safety-organization

.../safety-organization?cid=adv...

.../safety-organization?utm_=...

 

So in this case the latter parts of the URL's are email marketing tracking and I want the data to combine them all to look like the non-tracked version (the first URL). My thought was to make a Regex command saying to find the "?" and remove that and everything following it.

 

In Magic ETL I did a Replace Columns and the Regex command I used was \?(.*) after looking online at Regex documentation and then left the 3rd box blank because I didn't want it replaced with anything. Am I on the right track? As far as looking at the data there were many more instances of that URL so obviously it did something. I have noticed there are several versions of Regex depending on the language of the system so I would appreciate any feedback on my specific question as well as advice going forward on the type of syntax needed in Domo (similar to javascript, php, etc.)

 

Thanks!

Comments

  • Billobi
    Billobi Contributor

    You're definitely on the right path.  When I mock up the \?(.*) command, it snips off the ? character and anything to the right of it from the URL, which I think is your desired outcome.  This sort of ETL won't collapse any of your rows or give you counts of the various URL names, you'd need to do any grouping or counting separately.

     

    It's good practice to copy the column with the full URLs to a duplicate column that you then perform the RegEx on...that way you can quickly evaluate the success of your RegEx.  (To do that in ETL, Add Constants to add a column of NULL values, then Set Column Value to fill that NULL column with the URL you want to inspect).  Be sure you're using the "Use RegEx" operator in Replace Text, that's an easy one to forget about too.

     

    regex.jpg