Split columns based on number of characters using regex

Hi all,

I need to split a column that contains both a client name and number into 2 separate columns. Some client names themselves contain whitespace, so I cannot split on whitespace and am looking at Regex, though it's been a while since I've used it. All the client numbers are 5 digits long, though they are randomly generated and do not start with any particular digit. I want them to split as such:

Current

Test Company 30467

Test Company1 67389

—>

Future Client | Future Client ID

Test Company | 30467

Test Company1 | 67389

I have tried to split on .{5}$, though this has been unsuccessful.

Thanks in advance.

Tagged:

Best Answers

  • MarkSnodgrass
    Answer ✓

    I would do the following in Magic ETL with the formula tile and create them in this order:

    Client ID - RIGHT(clientfield,5)

    Client Name - TRIM(REPLACE(clientfield,Client ID,'')

    If you extract the client ID first, which you know is always the last 5 characters of the string, you can then use the replace function to replace it with nothing and then use the trim function to remove any extra spaces, which will leave you with the client name.

    **Check out my Domo Tips & Tricks Videos

    **Make sure to <3 any users posts that helped you.
    **Please mark as accepted the ones who solved your issue.
  • GrantSmith
    GrantSmith Coach
    Answer ✓

    @MarkSnodgrass 's solution will work but here's a regex version for you:

    You can use a formula tile and a REGEX_REPLACE

    Client Name

    REGEXP_REPLACE(`clientfield`, '^(.*) ?(\d+)$', '$1')
    

    Client ID

    REGEXP_REPLACE(`clientfield`, '^(.*) ?(\d+)$', '$2')
    

    A breakdown:

    ^ - Start of the string

    (…) - Defines a match group to store the results in for later

    . - match any character

    * - match 0 or more of the preceding character defined

    (space)? - match 0 or 1 space

    \d - match a digit

    + - match 1 or more of the preceding character defined

    $ - match the end of the string

    $1 - Returns the text captured with the first set of parenthesis.

    $2 - Returns the text captured with the second set of parenthesis. - The digits

    In simpler terms - match anything followed by a space, followed by one or more digits.

    If you want to match exactly 5 digits then you can use {5} instead of the + as you mentioned above.

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**

Answers

  • MarkSnodgrass
    Answer ✓

    I would do the following in Magic ETL with the formula tile and create them in this order:

    Client ID - RIGHT(clientfield,5)

    Client Name - TRIM(REPLACE(clientfield,Client ID,'')

    If you extract the client ID first, which you know is always the last 5 characters of the string, you can then use the replace function to replace it with nothing and then use the trim function to remove any extra spaces, which will leave you with the client name.

    **Check out my Domo Tips & Tricks Videos

    **Make sure to <3 any users posts that helped you.
    **Please mark as accepted the ones who solved your issue.
  • GrantSmith
    GrantSmith Coach
    Answer ✓

    @MarkSnodgrass 's solution will work but here's a regex version for you:

    You can use a formula tile and a REGEX_REPLACE

    Client Name

    REGEXP_REPLACE(`clientfield`, '^(.*) ?(\d+)$', '$1')
    

    Client ID

    REGEXP_REPLACE(`clientfield`, '^(.*) ?(\d+)$', '$2')
    

    A breakdown:

    ^ - Start of the string

    (…) - Defines a match group to store the results in for later

    . - match any character

    * - match 0 or more of the preceding character defined

    (space)? - match 0 or 1 space

    \d - match a digit

    + - match 1 or more of the preceding character defined

    $ - match the end of the string

    $1 - Returns the text captured with the first set of parenthesis.

    $2 - Returns the text captured with the second set of parenthesis. - The digits

    In simpler terms - match anything followed by a space, followed by one or more digits.

    If you want to match exactly 5 digits then you can use {5} instead of the + as you mentioned above.

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**