Is there a dataset size limit beyond which a DataFlow starts slowing down?
Hi,
One of our DataFlows is running extremely slowly. The input DataSet has 2.2 million rows, and when I try to join a column to it or add a calculated column, the run can take over 10 hours and then fails with a timeout. Is there an input DataSet size limit beyond which runs start to slow down? Thank you so much!
Answers
-
user04089,
Domo doesn't have a hard limit for input DataSets. There are many things that could slow down a DataFlow, and many of those causes could be minimized or addressed through optimizations. Could you please provide more context?
- What type of DataFlow (SQL, Magic ETL, etc.) are you using?
- How many columns do you have in the DataSet?
- What types of values do you have in the DataSet primarily? Long text values, for example, take up much more space than numeric columns.
We look forward to hearing back from you with some context to better understand your use case here.
Regards,
Darius Rose
-
Hi Darius,
Thank you for letting me know there is no hard limit. I'm using SQL. There are 17 columns: 6 are text and the rest are numeric. Two of the text columns can hold up to 200 characters. I guess they are probably the reason the DataFlow is running slowly.
Cheers,
Luna
-
user04089,
Thank you for the context. The size of the 200-character columns could certainly contribute to the run time. Beyond that, there are other likely causes of long run times. Do you have any joins? If so, have you added indexes before the joins to speed up those join steps? If you have joins, here is a good resource for adding indexes:
http://knowledge.domo.com?cid=optimizingdataflow
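As a rough illustration of why an index before a join matters, here is a minimal sketch. It uses SQLite from Python's standard library purely as a runnable stand-in, so the syntax in your SQL DataFlow's engine may differ. The table names, column names, and index name are all made up for the example. It compares the query plan for a join before and after the join column is indexed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical tables standing in for two DataFlow inputs.
cur.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)")
cur.execute("CREATE TABLE customers (customer_id INTEGER, region TEXT)")
cur.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, f"region_{i % 5}") for i in range(1000)],
)
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, i % 1000, i * 1.5) for i in range(5000)],
)

JOIN_SQL = """
SELECT o.order_id, o.amount, c.region
FROM orders AS o
JOIN customers AS c ON c.customer_id = o.customer_id
"""

def plan(sql):
    # The fourth column of each EXPLAIN QUERY PLAN row is the readable detail.
    return " | ".join(row[3] for row in cur.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = plan(JOIN_SQL)

# Index the column used in the join condition, then check the plan again.
cur.execute("CREATE INDEX idx_customers_customer_id ON customers (customer_id)")
plan_after = plan(JOIN_SQL)

row_count = cur.execute("SELECT COUNT(*) FROM (" + JOIN_SQL + ")").fetchone()[0]

print("before:", plan_before)
print("after: ", plan_after)
print("rows:  ", row_count)
```

The join returns the same rows either way. What changes is that the engine can look up matching rows through the index instead of scanning or building a temporary structure on every run.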
I look forward to hearing back from you about how that goes!
Regards,
Darius Rose
-
Hi Darius,
Thank you for the resource. I did add indexes before the join. That being said, I guess the text columns are the bottleneck. Thank you again!
Cheers,
Luna
-
user04089,
Here are a couple of suggestions to double check for causes of the slowness:
- Double check your indexes to ensure that every column used in your join condition is indexed (I have added conditions at times that increased my run times until I modified my indexes)
- Trim leading and trailing spaces from your strings so the data stored in Domo is as compact as possible
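
To make the two suggestions above concrete, here is a small sketch. SQLite (via Python's standard library) is only a stand-in for the SQL DataFlow engine, and the tables, columns, and index name are hypothetical. It trims the text columns into a cleaned table, builds one index that covers every column in the join condition, and compares the total character count before and after trimming.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical inputs; some text values carry stray leading/trailing spaces.
cur.execute("CREATE TABLE sales (store_id INTEGER, sku TEXT, note TEXT)")
cur.execute("CREATE TABLE prices (store_id INTEGER, sku TEXT, price REAL)")
cur.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "  A-100 ", "  rush order   "), (2, "B-200", "gift wrap  ")],
)
cur.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [(1, "A-100", 9.99), (2, "B-200", 4.50)],
)

# Step 1: trim the strings once, up front, into a cleaned table.
cur.execute(
    "CREATE TABLE sales_clean AS "
    "SELECT store_id, TRIM(sku) AS sku, TRIM(note) AS note FROM sales"
)

# Step 2: index every column that appears in the join condition.
cur.execute("CREATE INDEX idx_prices_store_sku ON prices (store_id, sku)")

joined = cur.execute(
    """
    SELECT s.store_id, s.sku, s.note, p.price
    FROM sales_clean AS s
    JOIN prices AS p
      ON p.store_id = s.store_id AND p.sku = s.sku
    ORDER BY s.store_id
    """
).fetchall()

# Total characters stored in the text columns, before and after trimming.
chars_before = cur.execute(
    "SELECT SUM(LENGTH(sku) + LENGTH(note)) FROM sales"
).fetchone()[0]
chars_after = cur.execute(
    "SELECT SUM(LENGTH(sku) + LENGTH(note)) FROM sales_clean"
).fetchone()[0]

print(joined)
print(chars_before, chars_after)
```

Note that in this sample the padded `sku` value would not match its counterpart in `prices` at all without the trim, so trimming can affect join correctness as well as size.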
If you do the above and still see issues, consider reaching out to Domo Support if you have access to that resource. Otherwise, consider using a Data Fusion for simple joins or unions that don't require additional manipulation:
http://knowledge.domo.com?cid=usingdatafusion
Thank you for reaching out and have a fantastic week!
Regards,
Darius Rose