Magic ETL

Magic ETL

Replacing Text in ETL

In my dataset, data can be entered in various formats, leading to multiple representations for the same individual data point. I'm utilizing the 'Replace Text' feature to standardize these names under one common identifier in my recursive ETL. However, this isn't replacing the existing names as intended. Instead, it's adding the last word of the text and creating a new name in my dataset. I have attached a picture of what is happing.

What I want:

image.png

What it is formatted as in some cases:

image.png

Is there a specific setting I need to click in order for this not to happen? Or maybe move the Replace Text tile to a new location within my recursive ETL?

Welcome!

It looks like you're new here. Members get access to exclusive content, events, rewards, and more. Sign in or register to get started.
Sign In

Best Answers

  • Answer ✓

    I would recommend using the formula tile and using a case statement. Something like this:

    CASE WHEN `city` IN('hon, HI','HI','Hawaii','Whatever else') THEN 'Honolulu, HI' ELSE `city` END

    If I solved your problem, please select "yes" above

  • Coach
    Answer ✓

    @TC1199A You could also use the following in the formula tile to only keep the characters before the second comma, if there are multiple:

    1. IFNULL(left(`city`,SUBSTRING_INDEX(`city`,',',2)-1),`city`)

Answers

  • Answer ✓

    I would recommend using the formula tile and using a case statement. Something like this:

    CASE WHEN `city` IN('hon, HI','HI','Hawaii','Whatever else') THEN 'Honolulu, HI' ELSE `city` END

    If I solved your problem, please select "yes" above

  • @ColemenWilson I will give this a try. would you recommend this for all text replacement? I have some other fields where i am seeing the same issue.

  • I generally use the formula tile over text replace in most applications. You have many more options for text replace because you have the full suite of MySQL text modification available to you. https://dev.mysql.com/doc/refman/8.0/en/string-functions.html

    If I solved your problem, please select "yes" above

  • Can you post some examples of your raw data before your replace text and also what your replace text tile settings look like? There may be a more programatic way of solving your problem with a regular expression.

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**
  • Coach
    Answer ✓

    @TC1199A You could also use the following in the formula tile to only keep the characters before the second comma, if there are multiple:

    1. IFNULL(left(`city`,SUBSTRING_INDEX(`city`,',',2)-1),`city`)

Welcome!

It looks like you're new here. Members get access to exclusive content, events, rewards, and more. Sign in or register to get started.
Sign In