Best Practice Create and Refresh Dataset


Hi all,


Does anyone know the best practice for refreshing a dataset? I have two cases:

1. Incremental every 6 hours (no updates to previous cycles' data)

2. Incremental every 6 hours (but previous cycles' data may be updated)


Assume the data is big: around 4 million rows per day. I also need historical data in the report, not just one day's data.

Please share best practices for creating and refreshing datasets.

  • AS

    That's a fair amount of data, and it's good that you asked for help.

    In the first case, if you're just appending the last 6 hours' worth of rows (6/24 × 4M = 1M rows), Domo should be able to handle that, depending on the transport method. Workbench can handle that. Some of the APIs might hit their limits if you're using a cloud connector.

    In the second case, it probably depends on how far back in time your data might be changing. If updates only ever touch a bounded window (say, the last few days), you can reload just that window each cycle and merge it into the historical dataset on a key.
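To make the second case concrete, here is a minimal sketch of that merge step in pandas. The dataset, column names (`order_id`, `updated_at`, `amount`), and values are all hypothetical, purely for illustration; inside Domo you'd express the same logic in a dataflow, but the idea is the same: append the fresh extract, then keep only the latest version of each key.

```python
import pandas as pd

# Hypothetical existing history and a new 6-hour extract.
history = pd.DataFrame({
    "order_id": [1, 2, 3],
    "updated_at": pd.to_datetime(["2024-01-01", "2024-01-01", "2024-01-01"]),
    "amount": [100, 200, 300],
})
new_cycle = pd.DataFrame({
    "order_id": [3, 4],  # order 3 was updated in this cycle; order 4 is new
    "updated_at": pd.to_datetime(["2024-01-02", "2024-01-02"]),
    "amount": [350, 400],
})

# Upsert: append the new rows, then keep only the most recent
# version of each key (the latest updated_at wins).
merged = (
    pd.concat([history, new_cycle])
    .sort_values("updated_at")
    .drop_duplicates("order_id", keep="last")
    .sort_values("order_id")
    .reset_index(drop=True)
)
print(merged)
```

The narrower the window of rows that can change, the smaller `new_cycle` stays, which is what keeps this approach cheap at 4M rows/day.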


    We have a few datasets that are larger (for us), and the data is needed as close to real time as possible (we need constant manufacturing monitoring), but we also don't want to put a performance drag on our production servers by sending lots of data to Domo. So we split the data loads into separate jobs: a small job for everything today, which refreshes throughout the day, and a job for everything before today, which refreshes once daily. Then we combine those datasets with a dataflow. We put the processing burden onto Domo (or Amazon, and let Domo pay for it). That's one strategy you can take, or you can come up with something like it for your needs.
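The combine step above can be sketched in a few lines of pandas. Everything here is hypothetical (the column names `event_date`, `machine`, `units` and the sample rows are made up); in Domo this would be a dataflow, but the logic is just: drop any overlapping dates from the history side, then union the two datasets.

```python
import pandas as pd

# Stand-ins for the two jobs described above:
# "history" refreshes once daily (everything before today),
# "today" refreshes throughout the day (today's rows only).
history = pd.DataFrame({
    "event_date": pd.to_datetime(["2024-01-01", "2024-01-02"]),
    "machine": ["A", "A"],
    "units": [500, 520],
})
today = pd.DataFrame({
    "event_date": pd.to_datetime(["2024-01-03", "2024-01-03"]),
    "machine": ["A", "B"],
    "units": [130, 90],
})

# The dataflow step: remove any of today's dates from history
# (in case the daily job already picked them up), then union
# the two into a single reporting dataset.
combined = pd.concat([
    history[~history["event_date"].isin(today["event_date"])],
    today,
]).reset_index(drop=True)
print(combined)
```

The point of the split is that the frequent small job stays cheap, while the big historical load runs only once a day.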


    Domo has a big data team that might be able to help you. Contact support and see if they can put you in touch with them. There is also a support team dedicated to Workbench that could help you.

    MajorDomo @ Merit Medical

This discussion has been closed.