Split a single column into multiple rows based on a delimiter

I have a column with a variable number of items, delimited by a comma.

 

For example: column_a: 1,2,3,4

 

Number of elements is variable - it can be a single element, or 50 (while I can set an upper limit, it can be dozens).

 

I need a way to split the data in the column into rows, so with the above example I'd like to have:

Input: column_a: 1,2,3,4

Output:

new_column

1

new_column

2

new_column

3

new_column

4

 

With Magic ETL I'm not able to do that, nor with SQL (again, without creating a very large and ugly SQL query to span maximum 50 elements).

 

Is there any other way other than scripting it externally to Domo? I have this requirement for several datasets and it starts to become an issue.

Comments

  • AS
    AS Coach

    Using MySQL dataflows you should be able to create a stored procedure to help you loop through the values and create more rows.

     

    This stack overflow post would be a good starting point:

    https://stackoverflow.com/questions/17308669/reverse-group-concat-in-mysql

     

     

    Aaron
    MajorDomo @ Merit Medical

    **Say "Thanks" by clicking the heart in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"
  • I'm trying to do the same thing and found this: https://knowledge.domo.com/Prepare/DataFlow_Tips_and_Tricks/Splitting_a_String_into_Multiple_Rows

     

    I have tried with my data and have run both sets of SQL code and clicked apply as per the instructions. What I cannot find though is the output table 'final' nor the ability to save the data flow for future use. Can anyone advise? Big thanks in advance.

     

    Screenshot 2019-01-23 at 12.17.48.png

  • You'd want to do a statement like this in the output:

    select * from final

     

    Once there's on output you can save the dataflow.

     

    The knowledge base article leaves it open ended as what to do next:
    "You can now use the “final” table in subsequent transforms, including Output DataSets transforms."

    Aaron
    MajorDomo @ Merit Medical

    **Say "Thanks" by clicking the heart in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"
  • Thankyou, it works perfectly now!

  • Hi anyone out there. I've noticed that i'm missing out on contact data because the column is being truncated. I think it's something to do with the varchar size.. can anyone please guide me as to how to increase this and for DOMO to understand? I've tried putting larger numbers in but it doesn't seem to help. 

     

    Any assistance much appreciated. 

     

    Cheers,

     

    Danny


    @DannyLewis wrote:

    Thankyou, it works perfectly now!


     

  • Nek
    Nek Member

    hello, thanks all for the hints and sharing!

     

    however, I have no idea why the 2nd SQl(CALL string_split_procedure) does not work on my case (attached the screenshots for reference) - that shows: 

    Whoops! Something went wrong.

    after running the SQL for few minutes

    appreciate for any advice and solution to the above, thanks a lot!

     

    regards, Yau

  • @DannyLewis - It looks like @AS solved your initial question, but it isn't marked as solved.  Would you mind giving that nice little green checkmark?  

     

     

    DataMaven
    Breaking Down Silos - Building Bridges
    **Say "Thanks" by clicking a reaction in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"
  • @Nek I wonder if you found a solution for this

    Been trying to use the dataflow.

    I was also getting the same error for the 2nd SQl call function (CALL string_split_procedure)

     

    "(Whoops! Something went wrong.)" 

    Let the thread know how to resolve this if you have the solution.

     

    Regards,

    Harsha

  • hey friends ... wow super old thread!  But this trick is a good one.

     

    Here's a solution in Redshift that may work a little faster because the MySQL proc does not use indexing and these string operations get expensive.

    https://stackoverflow.com/questions/46784721/redshift-split-single-dynamic-column-into-multiple-rows-in-new-table/46785509

     

    Also, if your data source is salesforce or any of the prebuilt connectors, you gotta work with what you've got.  If your data source is a custom connector, rewrite your connector to do the string split for you.  SQL is generally TERRIBLE at these types of string operations, whereas Python and JavaScript will be much faster at this type of transform.

     

    That said for TROUBLESHOOTING MySQL Stored Procedures...

    Keep in mind this is just MySQL.  So figure out what's broken.  

     

    1) are you executing the Call properly?

    https://stackoverflow.com/questions/16157349/how-to-call-a-mysql-stored-procedure-with-arguments-from-command-line

     

    If yes, the stored proc takes a while to run.  So instead of trying to SELECT* FROM FINAL try SELECT * from one of the intermediate tables.

     

    If you dissect your stored proc, you'll see that it has multiple CREATE table arguments but no test to see if it already exists.  This means you can only call your stored proc ONCE.  If you try calling it again, it will fail because your proc will try to CREATE a table that already exists and then error out.

     

    SIDE NOTE.  To test if you're calling the stored proc correctly, take all the guts out of the CREATE PROCEDURE statement.  Then try calling it.  If it works, then reload the page, throw all the guts back in and try again.  If it fails, slowly pare back the code until you figure out the line that breaks.

     

     

    Jae Wilson
    Check out my 🎥 Domo Training YouTube Channel 👨‍💻

    **Say "Thanks" by clicking the ❤️ in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"