Is there a dataset size cap beyond which a dataflow starts to slow down?

user04089
user04089 Member
edited March 2023 in Datasets

Hi,

One of our dataflows is running very slowly. The input dataset has 2.2 million rows, and if I join a column to it or add a calculated column, the run can take over 10 hours and then fails to finish because it times out. Is there a cap on input dataset size beyond which runs slow down? Thank you so much!

 

Answers

  • Darius
    Darius Domo Employee

    user04089,

     

    Domo doesn't have a hard limit for input DataSets. There are many things that could slow down a DataFlow, and many of those causes could be minimized or addressed through optimizations. Could you please provide more context?

    • What type of DataFlow (SQL, Magic ETL, etc) are you using?
    • How many columns do you have in the DataSet?
    • What types of values do you have in the DataSet primarily? Long text values, for example, take up much more space than numeric columns. 

    We look forward to hearing back from you with some context to better understand your use case here.

     

    Regards,


    Darius Rose
    **Say “Thanks” by clicking the “heart” in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"
  • Hi Darius,

     

    Thank you for letting me know there is no hard limit. I'm using SQL. There are 17 columns: 6 are text and the rest are numeric. Two of the text columns can hold up to 200 characters. I guess they are probably the reason the dataflow is running slowly.

     

    Cheers,

    Luna

  • Darius
    Darius Domo Employee
    Answer ✓

    user04089,

     

    Thank you for the context. The two 200-character columns could certainly contribute to the run time, but there are other likely causes of long run times as well. Do you have any joins? If so, have you added indexes before the joins to speed up those join steps? If you have joins, here is a good resource for adding indexes:

     

    http://knowledge.domo.com?cid=optimizingdataflow
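
As a rough illustration of why indexing the join column matters, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in engine. It is not Domo's engine, and the orders and customers tables, their columns, and the index name are all hypothetical. SQLite's automatic transient indexes are switched off only so that the effect of the explicit index shows up in the query plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Disable SQLite's automatic transient indexes so the explicit index is what changes the plan.
cur.execute("PRAGMA automatic_index = OFF")

# Hypothetical tables standing in for two input DataSets.
cur.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)")
cur.execute("CREATE TABLE customers (customer_id INTEGER, region TEXT)")
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, i % 100, i * 1.5) for i in range(1000)],
)
cur.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, f"region_{i % 5}") for i in range(100)],
)

JOIN_SQL = """
SELECT o.order_id, o.amount, c.region
FROM orders AS o
JOIN customers AS c ON c.customer_id = o.customer_id
"""


def query_plan(cursor, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in cursor.execute("EXPLAIN QUERY PLAN " + sql)]


# Plan without an index on the join column: both tables are scanned.
plan_before = query_plan(cur, JOIN_SQL)

# Index the column used in the join condition, then look at the plan again.
cur.execute("CREATE INDEX idx_customers_customer_id ON customers (customer_id)")
plan_after = query_plan(cur, JOIN_SQL)

row_count = len(cur.execute(JOIN_SQL).fetchall())

print("before:", plan_before)
print("after: ", plan_after)
print("joined rows:", row_count)
```

In a SQL DataFlow the index would be added in its own transform step before the join, using whatever index syntax that DataFlow's SQL engine supports; the resource linked above covers the specifics.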

     

    I look forward to hearing back from you about how that goes!

     

    Regards,


    Darius Rose
  • Hi Darius,

     

    Thank you for the resource. I did add indexes before the join. That being said, I guess the text columns are the bottleneck. Thank you again!

     

    Cheers,

    Luna

  • Darius
    Darius Domo Employee

    user04089,

     

    Here are a couple of suggestions to double check for causes of the slowness:

    • Double-check your indexes to ensure that every column used in your join condition is indexed (I have at times added join conditions that increased my run times until I updated my indexes to cover them)
    • Trim leading and trailing spaces from your strings so the data stored in Domo is as compact as possible
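
The trimming suggestion can be sketched as follows. This is a minimal example using Python's built-in sqlite3 module as a stand-in rather than Domo itself, with a hypothetical notes table and comment column. It compares the total stored text length before and after TRIM removes leading and trailing spaces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table with a text column that carries stray padding.
cur.execute("CREATE TABLE notes (id INTEGER, comment TEXT)")
cur.executemany(
    "INSERT INTO notes VALUES (?, ?)",
    [
        (1, "  late delivery   "),
        (2, "ok"),
        (3, "needs review "),
    ],
)


def total_text_length(cursor):
    """Sum of character lengths across the comment column."""
    return cursor.execute("SELECT SUM(LENGTH(comment)) FROM notes").fetchone()[0]


length_before = total_text_length(cur)

# TRIM with one argument strips leading and trailing spaces only.
cur.execute("UPDATE notes SET comment = TRIM(comment)")

length_after = total_text_length(cur)
trimmed = [row[0] for row in cur.execute("SELECT comment FROM notes ORDER BY id")]

print("characters before:", length_before)
print("characters after: ", length_after)
print(trimmed)
```

On a few rows the saving is trivial, but on millions of rows with wide text columns the padding can add up, and trimmed values also compare consistently when a text column is used in a join condition.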

    If you do the above and still see issues, consider reaching out to Domo Support if you have access to that resource. Otherwise, consider using a Data Fusion for simple joins or unions that don't require additional manipulation:

     

    http://knowledge.domo.com?cid=usingdatafusion

     

    Thank you for reaching out and have a fantastic week!

     

    Regards,


    Darius Rose