Pulling out UTM parameters for new columns

NathanDorsch
NathanDorsch Member
edited March 2022 in Magic ETL

I'm working with Google Analytics 4 data and am noticing some gaps with the campaign metrics. So, what I'd like to do is reference the page URL to look up UTM parameters and identify. In case you aren't familiar with UTM parameters, the way it works is you'll add a '?' and add these to your URL. Later in GA, you can tell the various different sources of traffic.

So for example, if I want to send users on LI to my example.com website with display ads for my NewCool campaign, I would hyperlink this URL:

example.com?utm_campaign=NewCool&utm_medium=display&utm_source=Linkedin

Is there a way create new columns (Campaign, Medium, Source, Content) that look up these strings and return the values following them? Below is an actual example - as you can see there are more types of parameters than the ones I want and they aren't always in the same order...

utm_source=facebook&utm_medium=cpc&utm_content=K9102FR&utm_campaign=K9102FR&fbclid=IwAR3RIyAvOwcSmyHRt2ni2_UQg8FzNK8Gd1knQmIS8wre9z5IFZoClc6n99Q

So, Source is facebook, medium is cpc, content is K9102FR and campaign is K91-2FR

Answers

  • Hi @NathanDorsch

    Regular expressions are your friend here for parsing this out of your URL string. You'd need different columns but you can do this all in the same Formula tile using different regular expressions:


    utm_source:

    REGEXP_REPLACE(`url`, '^.*utm_source=([^&]+).*$', '$1')
    

    utm_content:

    REGEXP_REPLACE(`url`, '^.*utm_content=([^&]+).*$', '$1')
    

    utm_campaign:

    REGEXP_REPLACE(`url`, '^.*utm_campaign=([^&]+).*$', '$1')
    

    utm_medium:

    REGEXP_REPLACE(`url`, '^.*utm_medium=([^&]+).*$', '$1')
    

    fbclid:

    REGEXP_REPLACE(`url`, '^.*fbclid=([^&]+).*$', '$1')
    


    By default if the regular expression doesn't match it'll return the original value. You could wrap these in CASE statements and using the REGEXP_LIKE function to conditionally return an empty string if it's not found.

    utm_medium:

    CASE WHEN REGEXP_LIKE(`url`, '^.*utm_medium=([^&]+).*$') THEN REGEXP_REPLACE(`url`, '^.*utm_medium=([^&]+).*$', '$1') ELSE '' END
    


    Regex Breakdown

    ^ - Start at the beginning of the string

    .* - Match any character . 0 or more times *

    utm_medium= Exactly match the string 'utm_medium='

    ([^&]+) Parenthesis tell the regex to store the result into a variable ($1 in this case as it's the first match defined in our regular expression) [^&] tell the regular expression to not ^ match the & character 1 or more times +

    $ Match the end of the string


    By matching the beginning and end of the string and using the replace it will replace the entire string with the value we found in our first match defined in $1

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**