MySQL Workflow

Can Domo's MySQL dataflows handle more than 500k records? It shows the import can only go to 500k, but I'm assuming that is for previewing the data, not actual processing. I attempted to output only five rows from a million-record dataset, and it ran for 30 minutes before I cancelled the process.


Comments

  • @ArborRose Yes, the 500k limit refers only to previewing the data. According to this best practices article, MySQL dataflows are best suited for transformations where the inputs are under 100 million rows. It's not out of the ordinary for MySQL run times to exceed 30 minutes, but you should be able to optimize the dataflow by adding indexes, reducing the number of joins, and selecting only the necessary rows and columns in each transform step (see the sketch below the link).

    https://domo-support.domo.com/s/article/360042935434?language=en_US#3.4.
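
    For example, a minimal sketch of that index-and-trim pattern across transform steps (table and column names here are hypothetical):

    ```sql
    -- Transform step 1: index the column later joins/filters use,
    -- so they don't have to full-scan the input
    ALTER TABLE input_dataset ADD INDEX idx_person_id (person_id);

    -- Transform step 2: carry forward only the rows and columns
    -- the next step actually needs
    SELECT person_id, event_date, amount
    FROM input_dataset
    WHERE event_date >= '2023-01-01';
    ```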

  • Thank you, MichelleH.

    The entire dataflow was an input table (12 million rows) and a single SELECT with WHERE id = XXXX and LIMIT 5. I tried the output a second time without the limit, restricted to a single person's records, and it ran for 25 minutes before I cancelled. That output would have been fewer than a hundred rows.
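
    Roughly, the whole test was just this one transform (the input table name is a placeholder and the id value is redacted, as above):

    ```sql
    -- Pull five rows for one id out of the 12-million-row input
    SELECT *
    FROM input_table   -- placeholder name for the input dataset
    WHERE id = XXXX    -- redacted id value
    LIMIT 5;
    ```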


  • @ArborRose Although you are limiting your rows in the transform, Domo's SQL-based dataflows still need to load all rows of the input datasets before executing any transforms, which explains why it is taking so long. For something as simple as what you're trying to do, I'd recommend using a Dataset View or MagicETL instead for better performance.
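
    Conceptually, a Dataset View applies the filter at read time instead of after a full copy of the input; in plain SQL terms it behaves more like a view than a materialized table (names hypothetical, id redacted):

    ```sql
    -- A view pushes the filter to query time, so the 12M input rows
    -- are never copied into the transform engine first
    CREATE VIEW person_records AS
    SELECT *
    FROM input_table
    WHERE id = XXXX;
    ```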

  • The SELECT with a limit was just a test command to check whether it could see beyond 500k rows. The process I need to create is much more complex.


  • @ArborRose That makes sense. You can also limit which columns you load from your input dataset to reduce the amount of time it takes to load.
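
    For instance, a first transform that trims the input to just the needed columns (column names here are hypothetical):

    ```sql
    -- Keep only the columns later steps actually need, so less data
    -- is loaded and scanned by subsequent transforms
    SELECT id, record_date, status
    FROM input_table;
    ```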

    That said, I've found that MagicETL v2 can handle some fairly complex transformations and is generally more efficient than MySQL. Ever since Domo upgraded from the previous version of MagicETL, the only times I still need to use SQL dataflows are for use cases with complex join criteria.