How to split all words in a row, separated by a comma, and append them. The result should be only on

How to split all words in a row, separated by a comma, and append them. The result should be only one column with only one word per row?

Tagged:

Answers

  • Hi @sangelov - when you say all words in a row, do you mean across multiple fields or a single field? The words in the field(s) are comma separated, is that correct? If the result should be only one column with a single word, what is being appended?

  • I think there's a split column tile in Magic ETL. You could try creating an ETL, split column using comma as the delimiter. Then use an unpivot columns tile which would turn them into rows under one column.

    ** Was this post helpful? Click Agree or Like below. **
    ** Did this solve your problem? Accept it as a solution! **

  • sangelov
    sangelov Member

    My main question is, after I split this column by comma and create 13 new columns with only a single word/product in a cell, how can I stack them together to get one column with one word/product per a row?

  • Here's an example. Using the following sample data

    id,word_list
    1,"apple,banana,cherry"
    2,"grape,orange"
    3,"mango,peach,plum,kiwi"
    4,"strawberry"

    I uploaded this as "Sample csv content".

    image.png

    Then I build an ETL.

    image.png

    In the split column tile:

    image.png

    In the unpivot tile:

    image.png

    My final dataset:

    image.png

    If I had only one row, I would end with only one column. This example assumes you want to work with many rows since you mention something about appending. I assume you are asking about appending the continuation of words as more rows.

    ** Was this post helpful? Click Agree or Like below. **
    ** Did this solve your problem? Accept it as a solution! **

  • I'd recommend using a python tile, so that you don't have to assume or maintain a specific number of splits

    # Import the domomagic package into the script
    from domomagic import *
    
    # read data from inputs into a data frame
    df = read_dataframe('<INPUT TILE NAME>') #change <INPUT TILE NAME>
    
    # write your script here
    
    # Split the CSV string into lists
    df['csv_split'] = df['<CSV COLUMN NAME>'].str.split(',') #change <CSV COLUMN NAME>
    
    # Explode the lists into separate rows
    df_exploded = df.explode('csv_split').reset_index(drop=True)
    
    # write a data frame so it's available to the next action
    write_dataframe(df_exploded)