Best Practice: ETLs Vs SQL ETLs

We are trying to rebuild a tool Domo Tech built for us called Domo Stacker. Its main purpose is to aggregate many datasets together. Our current datasets are 250+ recursive ETLs that output a dataset that is growing (think: historical + appended new data) for four buckets of categorized data (four separate outputs when datasets are aggregated together).

  • Does anyone know the limitations in Domo Magic ETL and Domo SQL ETL when it relates to how many input datasets you can bring in?
  • If the either of the above is the better option, is this a performant solution?
  • Also, if anyone has done anything similar to this, is our best tactical approach to build the new "aggregator" tool in Domo or would it benefit us to build externally -> aggregate the datasets together -> bring into Domo via workbench of the API?

PS: If you are wondering what on earth are you doing? We use Quickbooks Enterprise Desktop and we are limited to using Workbench to create the four separate jobs needed for our data tables. We also have CDATA that could read and write QB data to SQL that we can leverage to add color to the second bullet point above.

I appreciate your help and advice in advance.


Humbly,


Isaiah Melendez

Answers

  • Isaiah,

    I have not heard of a limitation of the number of input datasets (or seen it in the documentation), but you could probably ask your CSM to research that for you to get a firm answer. From a management standpoint, I would suggest breaking your ETLs into smaller ETLs, so it is easier to troubleshoot if something goes wrong. Sifting through a single ETL with that much going on could prove to be quite challenging.

    From a performance standpoint, Magic ETL will outperform SQL ETL by a longshot. I had a SQL ETL that I converted to Magic ETL and the runtime went from 30 minutes to 30 seconds. I also like the notes feature in Magic ETL so I can comment on each tile and explain what its purpose is.

    Magic's limitations are around complex joins and looping. You can usually get around the complex join by adding a filter tile after your join remove records. The looping can be more challenging.

    If you haven't read this KB article on recursive dataflows in the new Magic ETL, this could be helpful as well.

    https://domohelp.domo.com/hc/en-us/articles/360057087393-Creating-a-Recursive-Snapshot-DataFlow-in-the-New-Magic-ETL

    Hope this helps.

    **Check out my Domo Tips & Tricks Videos

    **Make sure to <3 any users posts that helped you.
    **Please mark as accepted the ones who solved your issue.
  • To add my two cents to Mark’s answer: MySQL will load your datasets sequentially so the more datasets you have the longer it will take. Magic will load them in parallel / at the same time so it’s a lot faster. I try to use magic unless I need to do some very complex join. Even then I’ll have the MySQL do at little as possible for my complex job and then use the resulting dataset in a magic dataset.

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**
  • Gents just to circle back on this, we ended up going with the append method and it worked fine for our needs. Much faster than Stacker (awkward look). :)

  • jaeW_at_Onyx
    jaeW_at_Onyx Coach
    edited March 2022

    @imelendez

    Stacker was built back in Magic 1.0 era.

    Magic 2.0 should outperform virtually every ETL engine EXCEPT Adrenaline Dataflows and in some cases Data Set Views.

    UNLESS you are doing something that a database is MUCH better at (GROUP BYs, Window Functions and JOINs).

    But even then, to opt for a SQL dataflow option, the volume of data would have to be so high that it outweighs the amount of time it takes to transfer and load the data into tables (and index) as per @GrantSmith


    especially with Magic 2.0+Partitioning almost all the 'enterprise tools' Stacker and Data Assembler should be superfluous

    Jae Wilson
    Check out my 🎥 Domo Training YouTube Channel 👨‍💻

    **Say "Thanks" by clicking the ❤️ in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"