DataSet Scheduling Hub

NWolf Domo Employee
edited August 2023 in Datasets Ideas

It would be nice to have a centralized place to view and manage the scheduling for all existing DataSets.

If I have answered your question, please click “Yes” on my comments option.

6 votes

In Review · Last Updated

Comments

  • ArborRose

    I have suggested this as a serious need. Your wording is much better than mine. :)

    ** Was this post helpful? Click Agree or Like below. **
    ** Did this solve your problem? Accept it as a solution! **

  • Garrett_Kohler

    One thing I've found useful is to create a monitoring dashboard or page using the DS_DomoStats_Datasets and DS_DomoStats_Dataflows datasets, which I believe are publicly available.

    Creating a table with the scheduling information and hyperlinks out to configuration pages is the best I've been able to put together.

  • JoshMillheim Domo Product Manager

    @NWolf @ArborRose - can you tell me more about your use case? Are you trying to make bulk changes for all datasets so that they start running at the same time? Is this a time-savings thing, or more a case where you are trying to orchestrate all points of your data pipeline to be in sync?

  • ArborRose

    Let me first state how excited I am that someone is looking at this.

    Imagine the company manages stores. We don't, but pretend we do. All the stores have sales transactions each night, and those transactions are customer purchases. Every night you need to pull all the data from the stores and centralize it. To analyze it, you need to gather the product codes, etc. One data pull may be the sales transactions. Another might be a customer list with addresses and demographic information. Another might be a product list with SKU codes, pricing, etc.

    All those transactions are centralized, and we make API calls for each table. Those tables take time to pull, and there may be several dozen each day. Some may take 6-10 hours to pull. Sometimes when these API calls overlap, they work. Others can't overlap, or they fail. For the customer list, I can't give it a parameter to shorten the dataset; I must pull all customers who have ever shopped at a store. If any job stops or conflicts, we don't get the data that day unless it's a short one. Because of the duration, we may not be able to run it again that day because there isn't enough open room left in the schedule.

    What am I looking for?

    Imagine graphically setting up a project in project management software. You can see the handoffs from one task to another. I want something similar related to my Domo scheduled tasks.

    I don't care about flows that are triggered by tasks, just the API calls that have a set schedule to run. Ideally, I could choose the type of calls I want to view, such as Workbench versus the normal interface. I want something that shows my daily scheduled jobs: not when each one ran, but when it is scheduled to run and how long it's taking, so that I can move things around within the hours of the day.
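
To make the request concrete, here is a rough sketch of the kind of check such a view would allow: given each job's scheduled start and typical run time, flag the pairs whose windows collide. The job names, times, and the `find_overlaps` helper are all made up for illustration; nothing here comes from a Domo API.

```python
from datetime import datetime, timedelta
from itertools import combinations

# Hypothetical schedule entries: (job name, scheduled start, typical run duration).
# The names and times are invented for illustration only.
SCHEDULE = [
    ("sales_transactions", datetime(2023, 8, 1, 1, 0), timedelta(hours=6)),
    ("customer_list", datetime(2023, 8, 1, 2, 0), timedelta(hours=10)),
    ("product_list", datetime(2023, 8, 1, 13, 0), timedelta(hours=1)),
]


def find_overlaps(schedule):
    """Return pairs of job names whose scheduled windows overlap."""
    overlaps = []
    for (name_a, start_a, dur_a), (name_b, start_b, dur_b) in combinations(schedule, 2):
        end_a = start_a + dur_a
        end_b = start_b + dur_b
        # Two half-open windows [start, end) overlap when each one
        # starts before the other one ends.
        if start_a < end_b and start_b < end_a:
            overlaps.append((name_a, name_b))
    return overlaps


if __name__ == "__main__":
    for a, b in find_overlaps(SCHEDULE):
        print(f"{a} overlaps {b}")
```

With the scheduled start and a typical duration side by side, it becomes possible to see which jobs would need to move before they ever run.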


  • DataMaven

    It would be great to be able to actually manage the schedules from an interface, since it's a lot of clicks otherwise.

    @ArborRose - It seems like what you are looking for can be achieved using the Datasets History Dataset, joined to the Datasets Dataset. Then use the Gantt chart to show the histories and durations of the runs. I did a test, with a filter of yesterday, and it worked.
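
The join described above can be sketched roughly as follows, outside of Domo, with plain Python. The field names (`dataset_id`, `name`, `start`, `end`) and the sample rows are assumptions for illustration; the real columns in the history and datasets reports may be named differently.

```python
from datetime import datetime

# Hypothetical rows; field names are assumptions, not the real report columns.
DATASETS = [
    {"dataset_id": "a1", "name": "Sales Transactions"},
    {"dataset_id": "b2", "name": "Customer List"},
]
HISTORY = [
    {"dataset_id": "b2", "start": datetime(2023, 8, 1, 2, 0), "end": datetime(2023, 8, 1, 11, 0)},
    {"dataset_id": "a1", "start": datetime(2023, 8, 1, 1, 0), "end": datetime(2023, 8, 1, 7, 30)},
    {"dataset_id": "zz", "start": datetime(2023, 8, 1, 4, 0), "end": datetime(2023, 8, 1, 5, 0)},
]


def gantt_rows(datasets, history):
    """Inner-join run history to dataset names and compute each run's duration."""
    names = {d["dataset_id"]: d["name"] for d in datasets}
    rows = []
    for run in history:
        name = names.get(run["dataset_id"])
        if name is None:
            continue  # no matching dataset, so the run is dropped (inner join)
        hours = (run["end"] - run["start"]).total_seconds() / 3600
        rows.append(
            {"name": name, "start": run["start"], "end": run["end"], "duration_hours": hours}
        )
    # Sort by start time so the rows read top to bottom like a Gantt chart.
    return sorted(rows, key=lambda r: r["start"])


if __name__ == "__main__":
    for row in gantt_rows(DATASETS, HISTORY):
        print(f'{row["name"]}: {row["start"]:%H:%M} to {row["end"]:%H:%M} ({row["duration_hours"]} h)')
```

Each output row has a label, a start, and an end, which is the shape a Gantt chart needs. Note that this reflects what actually ran, not what was scheduled.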

    DataMaven
    Breaking Down Silos - Building Bridges
    **Say "Thanks" by clicking a reaction in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"
  • ArborRose

    @DataMaven - Yes, I'd like to manage the schedules. But first, I'd like to get an accurate account of what is scheduled. One of the problems with using the history is that it shows what ran, not the schedule settings. I see Domo running things far off the time I have set.

    I'm interested to see what you come up with in your Gantt view. If I set each value to one hour, using something like DATE_SUB(Last_Run_Date, INTERVAL 1 HOUR), I would get a stack of rows containing small rectangles. This wouldn't show me the accurate start or duration. I'd have to pull more data for that.

    The above are just a sample of the schedule I'd be looking at. And all of these are on different rows, rather than a single row (most of these run consecutively).

    My domain is set to UTC because we could not get it working with Domo in my time zone when the APIs aren't translated to UTC (I don't control the APIs, by the way). So the times you see in the chart are actually 6 hours ahead of my local time. I'd have to subtract six hours from the field value to view it in my time zone.
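
As a small illustration of that six-hour shift, here is a sketch in Python that treats a chart timestamp as UTC and converts it to a fixed UTC-6 offset. The `utc_to_local` helper and the sample timestamp are hypothetical, and a fixed offset is an assumption: it ignores daylight saving changes.

```python
from datetime import datetime, timedelta, timezone

# Assumption: local time is a fixed six hours behind UTC (no daylight saving handling).
LOCAL_OFFSET = timezone(timedelta(hours=-6))


def utc_to_local(naive_utc):
    """Interpret a naive timestamp as UTC and convert it to the fixed local offset."""
    return naive_utc.replace(tzinfo=timezone.utc).astimezone(LOCAL_OFFSET)


if __name__ == "__main__":
    chart_time = datetime(2023, 8, 1, 3, 0)  # hypothetical value as shown in the chart
    print(utc_to_local(chart_time).isoformat())
```

A run shown in the early hours of the UTC day lands on the previous calendar day locally, which matters when filtering the chart by date.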
