Dataflow Run Control - Execute on 1st of month only

I have a large dataset that is refreshed daily.  I have cards, however, that I would like to remain static during the month, reflecting only the end-of-month value(s).

 

I could:

  • create another Workbench job that uploads a second version of the data on a different schedule, or
  • create a manually executed dataflow that copies the main dataset once a month and is NOT set to update automatically; but then I have to log in and run it manually whenever the first of the month falls on a weekend or holiday

 

What I'd really like to do is figure out whether there is a way to create a smart dataflow that can test the date and run only when it's the first of the month.  In straight T-SQL, it might look something like this...

 

BEGIN
     IF DAY(GETDATE()) <> 1     -- T-SQL: GETDATE() rather than MySQL's CURDATE(); no THEN keyword
          RETURN;               -- not the 1st of the month: exit without producing output
     ELSE
          SELECT *
          FROM sourcetable;
END
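
(Note that a plain date filter inside the dataflow wouldn't accomplish this. Assuming a MySQL-dialect SQL dataflow, something like the sketch below would still run on every refresh, and on any day other than the 1st it would replace the output with zero rows rather than leaving the prior month-end snapshot alone.)

     -- naive approach: filter rows by the run date (MySQL dialect)
     SELECT *
     FROM sourcetable
     WHERE DAY(CURDATE()) = 1
     -- the dataflow still executes on every trigger, so on days 2-31 this
     -- overwrites the output dataset with an empty result instead of
     -- skipping the run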

 

Any thoughts on how to do this in a DATAFLOW, or, even better, am I just missing some control capability that's already available?

**Say thank you by clicking the 'thumbs up'
**Be sure to select the answer that represents the best solution and mark as "Accept as Solution"

Best Answer

  • AS Coach
    Answer ✓

    Hi Craig

    I've seen people create a dataset solely to trigger dataflows before.  They set the Workbench update schedule to be the dataflow schedule they want, and then create the dataflow with only that one dataset as the trigger mechanism.  That way, whatever happens to the other input datasets won't be reflected in the cards until Workbench runs and kicks off the dataflow update.  (There's a minimal sketch of what that could look like just below this post.)

    Does that make sense?

    The drawback is that you just have some otherwise lame dataset in the datacenter, so it adds a little bit extra to manage.

    Aaron
    MajorDomo @ Merit Medical

    **Say "Thanks" by clicking the heart in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"
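
To make the accepted approach concrete, here is a minimal sketch of the dataflow's single transform, assuming a SQL (MySQL-dialect) dataflow; monthly_trigger and primary_dataset are hypothetical dataset names:

     -- monthly_trigger: tiny Workbench dataset scheduled to load on the
     -- 1st of each month; added as a dataflow input and set as the only
     -- update trigger, but its rows are never read
     -- primary_dataset: refreshes daily, but is not set as a trigger
     SELECT *
     FROM primary_dataset

Because the dataflow runs only when monthly_trigger updates, the output is effectively a once-a-month snapshot of the primary dataset.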

Answers

  • Hi all,

    Can anybody help out @ckatzman?

    Thanks!

  • ckatzman Contributor

    @DaniBoy, is there a Dojo protocol for reaching out to someone at Domo when the Dojo has been unable to provide a solution (either no one has seen the question, or no one knows the answer)?  Is it just a matter of redirecting the inquiry to Domo Support, and if so, is there an @name to loop in Domo Support rather than having to do so separately via email or phone call?

     

    Thanks,

    Craig

    **Say thank you by clicking the 'thumbs up'
    **Be sure to select the answer that represents the best solution and mark as "Accept as Solution"
  • ckatzman Contributor

    That's an interesting idea, Aaron! Let me say it back to make sure I understand your methodology.  Please correct me if I misspeak here...

     

    A Workbench job with a dummy dataset, leveraging the Workbench scheduler to run only once a month.  Then attach that dummy dataset in some innocuous way to the primary dataset within a dataflow, triggered to refresh only when the dummy dataset is updated.

     

    So while the primary dataset may update daily, the dataflow creates a version of the primary dataset once a month, and that version stays static for the remainder of the month so long as the dummy dataset is not updated.

    **Say thank you by clicking the 'thumbs up'
    **Be sure to select the answer that represents the best solution and mark as "Accept as Solution"
  • Yes.  The dummy dataset serves only as a trigger.  It is an input to the dataflow, but its data is unused.  The data comes from the primary dataset only.  Dataflows don't care whether the inputs are actually used, so we can take advantage of that "feature".

    Aaron
    MajorDomo @ Merit Medical

    **Say "Thanks" by clicking the heart in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"
  • @ckatzman, we are working on improving the process for threads that cannot get solved in Dojo. In the short term, please feel free to use the "notify moderator" link on the lower left of each post to let us know a particular thread needs Domo Support engaged.

     

    We do a weekly review of all unanswered threads, in addition to those that have a lot of engagement but no solution. We bring Domo Support into the threads that need attention, but we appreciate the extra eyes in the community on ones we may not catch.

     

    Thanks!

  • I read this because I was also looking for a "1st of month" trigger. I recognized that some connectors have that feature. So instead of setting up a Workbench job running on a (virtual) machine, or elsewhere, with the risk of it not being active at that moment, I scheduled a Google Sheets dataset with static data (I had one anyway) to update on the first of the month, and I use it within the dataflows as the trigger to run.

     

    Just another approach on the same solution.

     

    Cheerio,

    Dimitri

  • I think that is not the case anymore? At least, I get an error when trying to save a dataflow that has an input that does not result in any actions...
    How can I make an input dataset "irrelevant" within a dataflow, at least with respect to the data in the output?
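
    One possible workaround, sketched under the assumption of a MySQL-dialect dataflow (dataset names hypothetical, as in the sketch above): reference the trigger input in a predicate that can never change the result, so the validator sees the input being used while the output data still comes entirely from the primary dataset.

         -- the scalar subquery touches monthly_trigger, satisfying the
         -- validator, but COUNT(*) >= 0 is always true, so no rows are
         -- ever filtered out
         SELECT p.*
         FROM primary_dataset p
         WHERE (SELECT COUNT(*) FROM monthly_trigger) >= 0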