Display Row count at every step in Magic ETL

RobynLinden Coach
edited December 2023 in Magic ETL Ideas

Currently, if you want to know your row count in Magic ETL at any stage other than the output, you have to place a few temporary tiles to get there. For example, add a Rank & Window tile to establish a Row Number column, then aggregate with a Group By to get the max row number.
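
As a rough illustration of what that workaround computes, here is a plain-Python sketch. It is not Magic ETL code; the function names and sample rows are hypothetical and only mimic the two temporary tiles.

```python
# Illustrative only: mimics the temporary-tile workaround in plain Python.
# add_row_number and max_row_number are hypothetical names, not Magic ETL functions.

def add_row_number(rows):
    """Rank & Window step: tag each row with a 1-based row number."""
    return [{**row, "Row Number": i} for i, row in enumerate(rows, start=1)]

def max_row_number(rows):
    """Group By step: aggregate to the max row number, which equals the row count."""
    return max((row["Row Number"] for row in rows), default=0)

sample = [{"order_id": 101}, {"order_id": 102}, {"order_id": 103}]
row_count = max_row_number(add_row_number(sample))
print(row_count)  # 3
```

Two extra steps just to learn a single number is the overhead this idea is asking to remove.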

Would really love it if each tile could show the number of rows it contains, so if you're losing or blowing up data it's easier to find where the changes are happening. Would save a TON of time.

I realize there may be limitations because Magic can only preview 400k rows - so just like on a table card if you're over the limit of data that can load, it would be fair to say so. I'd rather be limited to how much data I can know about than not get any row counts at all.
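
A minimal sketch of how a limit-aware label could behave, assuming the 400k preview cap mentioned above. The function name and label format are made up for illustration and do not describe anything Domo actually ships.

```python
# Hypothetical helper: show an exact count below the preview limit,
# or "limit+" once the limit is reached and the true total is unknown.
PREVIEW_LIMIT = 400_000  # cap taken from the post; treated as an assumption here

def row_count_label(rows_seen, limit=PREVIEW_LIMIT):
    """Return a display label for a tile's row count."""
    if rows_seen >= limit:
        # At the cap we cannot tell whether more rows exist, so say "at least".
        return f"{limit:,}+ rows"
    return f"{rows_seen:,} rows"

print(row_count_label(8645))     # 8,645 rows
print(row_count_label(400_000))  # 400,000+ rows
```

The point of the `+` suffix is to be honest about the cap instead of hiding the count entirely.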

Broadway + Data
49 votes

In Review

This is a great idea and one that has been on my mind, as a Magic user myself. I have it on my "wishlist", but after seeing the idea here and additional feedback on the thread, I will review it again with my engineering team.

Comments

  • @RobynLinden like ... just in the preview? or as an output dataset? would you want it as part of the execution details page?


    In the immediate term, I just always tell people to use a GROUP BY or REMOVE DUPLICATES tile before they do a JOIN, and that solves any row growth problems... unless there's a NULL in the column. Don't join on NULL.

    Jae Wilson
  • Right, in the preview - maybe right here.


    I want to know how many rows are contained in the step. So if I have 1000, I blow it up to 2000 on a join, but then land with 500 due to a group by -- just tell me that in the header of each tile after I run a preview. If 500 is what I wanted, then I'm happy.


    Broadway + Data
  • Love this idea! I always have to put output datasets just to check row count when I am troubleshooting to make sure I did not blow up a join.

  • This would also be helpful to see as an additional column in the run history so we can compare "Rows Processed" to "Output Rows" at each step.

  • This is a great idea. Domo is silly for not implementing this sooner.

  • AndreaHenderson Domo Product Manager

    Love this idea! I have it on my own "wishlist", and in the roadmap backlog. Reading all this feedback, though, I'm going to make sure to surface it during my next roadmap conversations.

    Domo Product Manager for Data Transformation (MagicETL)

  • Thank you @AndreaLovesData !!

    Broadway + Data
  • This came up today in a working session. Someone specifically wanted to see the row count at each step!

  • I was coming here today to literally type the exact same idea with the exact same screenshot. It would be a game changer to have this on each tile so you know how many rows it would output. And be sure the row count number covers all rows, even though the preview only shows 100 rows. If it goes over the number you've queried, let's say you only query 10K, then have it say 10K+, and if it is under 10K then have it give the exact number, 8,645 rows or whatever it would be.

  • This would make a massive ease-of-use difference for me.
    I've previously had some experience using an ETL tool called Alteryx, which does this really well.

    Here is a clip from YouTube showing how they do it.

    I would suggest providing the option to overlay it on the ETL graph for the last successful run.
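
The join scenario discussed in the comments above (rows growing on a join with duplicate keys, and deduplicating the lookup side first to prevent it) can be sketched in plain Python. This is only an illustration: the functions and sample data are hypothetical stand-ins for the Join and Remove Duplicates tiles, and skipping None keys is an assumption made for the sketch, not a statement about how Magic ETL treats NULLs.

```python
# Illustrative sketch only: plain-Python stand-ins for Join and Remove Duplicates tiles.
# All function names and sample data are hypothetical.

def inner_join(left, right, key):
    """Inner join two lists of dicts on `key`. Rows with a None key never match
    (an assumption made for this sketch, not a claim about Magic ETL)."""
    index = {}
    for row in right:
        if row[key] is not None:
            index.setdefault(row[key], []).append(row)
    return [{**lrow, **rrow} for lrow in left for rrow in index.get(lrow[key], [])]

def remove_duplicates(rows, key):
    """Keep the first row seen for each value of `key`."""
    seen, unique = set(), []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique

orders = [{"customer": "A", "order": 1}, {"customer": "B", "order": 2}]
customers = [
    {"customer": "A", "region": "East"},
    {"customer": "A", "region": "East"},  # duplicate key: causes row growth
    {"customer": "B", "region": "West"},
]

blown_up = inner_join(orders, customers, "customer")
clean = inner_join(orders, remove_duplicates(customers, "customer"), "customer")

# Row count after each step, the kind of number the thread asks to see on every tile.
step_counts = {
    "orders (input)": len(orders),
    "join, raw lookup": len(blown_up),
    "join, deduplicated lookup": len(clean),
}
for step, count in step_counts.items():
    print(f"{step}: {count} rows")
```

Seeing 2, then 3, then 2 side by side is what makes the duplicate in the lookup table obvious, which is the troubleshooting shortcut per-tile row counts would provide.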