Magic ETL - Change column types and merge files

bdx
bdx Member

Hi, I am merging 7 data sets into one. I would like to change the column types on all 7 of the source files before merging into the output data set.

2 columns in each file are "text" columns, but I wish to change them to date type.

Some of the data contains "n/a" as a string. When I ran a simple Set Column Type ETL on this field, the "n/a" prevented the conversion.

I am applying a value mapper to both text columns containing the date (but with column type "text") to convert the "n/a" string to an arbitrary date (e.g. 1970-01-01).

Then I am applying the Set Column Type transformation to change those columns to column type date. After that I run an append rows transformation to complete the ETL and populate the Output file. See below a snippet of the Magic ETL.

When I run this, the ETL creates 2x the number of rows vs. the sum of the original 7 datasets.

Any ideas why this is and how I can achieve my goals (replace column types for 2 columns and combine all 7 sets)?


Many thanks

Best Answer

  • MarkSnodgrass
    Answer ✓

    Based on your screenshot, your dataset is doubling because you are sending the entire dataset to Value Mapper 2 and the entire dataset to Value Mapper 9. In order to avoid the duplication, move your Value Mapper 9 and and Set Column Type 2 to be after Set Column Type 9 so that your source data is not being split out.

    Also, you could do this work after your append your 7 datasets together so that you are not having to repeat this work for each dataset. This would make it easier to maintain.

    Finally, Magic ETL 2.0 will make this kind of work much easier because they have added a formula tile where you could do the replace and make it a date type all in one tile. You can talk to your CSM to see if that can be turned on in your instance. It is currently still in Beta.

    **Check out my Domo Tips & Tricks Videos

    **Make sure to <3 any users posts that helped you.
    **Please mark as accepted the ones who solved your issue.

Answers

  • MarkSnodgrass
    Answer ✓

    Based on your screenshot, your dataset is doubling because you are sending the entire dataset to Value Mapper 2 and the entire dataset to Value Mapper 9. In order to avoid the duplication, move your Value Mapper 9 and and Set Column Type 2 to be after Set Column Type 9 so that your source data is not being split out.

    Also, you could do this work after your append your 7 datasets together so that you are not having to repeat this work for each dataset. This would make it easier to maintain.

    Finally, Magic ETL 2.0 will make this kind of work much easier because they have added a formula tile where you could do the replace and make it a date type all in one tile. You can talk to your CSM to see if that can be turned on in your instance. It is currently still in Beta.

    **Check out my Domo Tips & Tricks Videos

    **Make sure to <3 any users posts that helped you.
    **Please mark as accepted the ones who solved your issue.
  • bdx
    bdx Member

    Thanks Mark. I am running this new structure now as a test.

  • bdx
    bdx Member

    Thanks for the help. Fixed