Beast Mode "Share Calculation on DataSet" vs. Calculating in Data Flow

Hi all,

 

Are there performance advantages to using the Beast Mode "Share Calculation on DataSet" to create a metric for a DataSet compared to putting logic/formula in a DataFlow to have the metric added as a physical column on the DataSet?

 

Curious to learn others' thoughts/approaches on balancing when logic should be part of the underlying DataFlow versus handled as a shared Beast Mode.

 

Thanks,


Samir

Best Answers

  • jaeW_at_Onyx
    jaeW_at_Onyx Coach
    Answer ✓

    Are there performance advantages to using the Beast Mode "Share Calculation on DataSet" to create a metric for a DataSet compared to putting logic/formula in a DataFlow to have the metric added as a physical column on the DataSet?

     

    Yes. Beast Modes are evaluated at runtime. For small datasets the impact will be trivial or unnoticeable; however, as datasets get large (100+ million rows), certain types of calculations will take longer to evaluate.

     

    If possible and reasonable, materialize transforms (i.e. build them into the DataFlow so they land as physical columns). This will make managing Beast Modes easier. That said, from a usability perspective, if you can surface the transform to business users as a Beast Mode, it makes metadata management slightly easier (assuming they can read basic SQL).

     

    Also, keep in mind that calculated metrics like percentages or ratios usually cannot be implemented at the dataset level, because they MUST be calculated at runtime, after aggregation, in order to return the 'right' answer.

     

    If you have specific questions, let me know!

    Jae Wilson
    Check out my 🎥 Domo Training YouTube Channel 👨‍💻

    **Say "Thanks" by clicking the ❤️ in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"
  • jaeW_at_Onyx
    jaeW_at_Onyx Coach
    Answer ✓

    Sure @sdarba 

    I'll compare two major use cases:

    1) adding dimension attributes to a dataset

    2) adding metrics to a dataset

     

    -- Dimension Attributes --

     

    Some clients will build complex beast modes for categorizing data.  Consider:

     

    CASE
    WHEN LOWER(`campaign name`) LIKE '%disney%' THEN 'Disney'
    WHEN LOWER(`campaign name`) LIKE '%universal%' THEN 'Universal'
    ...
    ELSE 'Campaign Not Matched'
    END

     

    UPSIDE

    Beast modes like this are easy to manage / see because you just open the beast mode to understand why you're not getting the expected result. 

     

    DOWNSIDE

    Imagine you have the same beast mode deployed to 15 datasets and you add a new campaign.  Now you need to update 15 beast modes.

    From the technology side, imagine your dataset has hundreds of millions of rows.  The more data you have, the worse a transform with LIKE will perform, because the pattern match has to be evaluated against every row at runtime.

     

    RECOMMENDATION

    If it's reasonable, materialize the transform and make it part of the dataset (use a lookup table).
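
    As a rough sketch of the lookup-table idea, the snippet below uses Python's built-in sqlite3 module as a stand-in for a SQL DataFlow. The table names (campaigns, campaign_lookup), the column names, and the sample rows are all hypothetical; the point is only that the patterns live in one table, so adding a campaign means adding one row rather than editing 15 Beast Modes.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a SQL DataFlow (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaigns (campaign_name TEXT);
    INSERT INTO campaigns VALUES
        ('Disney Summer Promo'),
        ('UNIVERSAL Parks Q3'),
        ('Local Radio Spot');

    -- Hypothetical lookup table: one row per pattern, maintained in one place.
    CREATE TABLE campaign_lookup (pattern TEXT, campaign_group TEXT);
    INSERT INTO campaign_lookup VALUES
        ('%disney%', 'Disney'),
        ('%universal%', 'Universal');
""")

# Materialize the category once, at transform time, instead of at card runtime.
rows = conn.execute("""
    SELECT c.campaign_name,
           COALESCE(l.campaign_group, 'Campaign Not Matched') AS campaign_group
    FROM campaigns c
    LEFT JOIN campaign_lookup l
           ON LOWER(c.campaign_name) LIKE l.pattern
    ORDER BY c.campaign_name
""").fetchall()

for name, group in rows:
    print(name, "->", group)
```

    One caveat with this shape: a campaign name that matches more than one pattern would produce duplicate rows, so the patterns need to be mutually exclusive or the join needs a tie-break rule.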

     

    -- Metrics --

    Consider the example of profit margin percent: (sum(amount) - sum(cost)) / sum(cost)

    If you calculate profit margin percent on each row of your data, you get values like .02, .03, .07, etc.  If you then add up that row-level column, the total will eventually exceed 100%, and you can't have more than 100% profit margin.  That is why it is inappropriate to 'materialize the metric in the dataset': the ratio has to be computed from the summed numerator and denominator at runtime.
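
    To make the arithmetic concrete, here is a small Python sketch with made-up numbers. It uses the formula from this post written with explicit parentheses; the rows and the helper name margin_pct are assumptions for illustration only.

```python
# Hypothetical rows; the values are invented for illustration.
rows = [
    {"amount": 100.0, "cost": 98.0},
    {"amount": 200.0, "cost": 194.0},
    {"amount": 50.0, "cost": 46.5},
]


def margin_pct(amount, cost):
    """The thread's formula with explicit parentheses: (amount - cost) / cost."""
    return (amount - cost) / cost


# What a materialized row-level column would hold (roughly .02, .03, .07).
row_level = [margin_pct(r["amount"], r["cost"]) for r in rows]

# Summing that column just keeps growing as rows are added.
sum_of_row_ratios = sum(row_level)

# What a runtime calculation does: aggregate first, then divide.
total_amount = sum(r["amount"] for r in rows)
total_cost = sum(r["cost"] for r in rows)
ratio_of_sums = margin_pct(total_amount, total_cost)

print("sum of row-level ratios:", round(sum_of_row_ratios, 4))
print("ratio of sums:          ", round(ratio_of_sums, 4))
```

    With only three rows the summed column is already several times larger than the true aggregate ratio, and the gap widens with every row added.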

     

    THAT SAID

    Consider the example of profit (sales - cost).  You COULD materialize that calculation because you CAN sum profit and get a sensible result.

    Even so, this type of basic math is something that Adrenaline will be good at even into the hundreds of millions of rows, so it makes more sense to show the math to the users (in a Beast Mode) where they can understand the metadata (i.e. how profit is calculated).
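
    For contrast with the ratio case, a quick Python sketch of why profit is safe to materialize: a row-level profit column sums to the same number as sum(sales) - sum(cost) at any grouping. The region column and the sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows; "region" is an invented grouping column.
rows = [
    {"region": "East", "sales": 100.0, "cost": 60.0},
    {"region": "East", "sales": 80.0, "cost": 50.0},
    {"region": "West", "sales": 120.0, "cost": 90.0},
]

# Materialized approach: compute profit once per row, then sum it per group.
materialized = defaultdict(float)
for r in rows:
    materialized[r["region"]] += r["sales"] - r["cost"]

# Runtime approach: sum sales and cost per group, then subtract.
sales = defaultdict(float)
cost = defaultdict(float)
for r in rows:
    sales[r["region"]] += r["sales"]
    cost[r["region"]] += r["cost"]
runtime = {region: sales[region] - cost[region] for region in sales}

# Both approaches agree for every group because subtraction distributes over sums.
print(dict(materialized))
print(runtime)
```

    Since both routes give the same answer, the choice comes down to performance and to where you want users to be able to read the definition.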

     

    Hope that helps!

    Jae Wilson

Answers

  • sdarba
    sdarba Member

    Thanks Jae!

     

    "if possible / reasonable, materialize transforms.  this will make managing beast modes easier.  that said, from the usability perspective, if you can surface the transform to business users, it makes metadata management slightly easier (assuming they can read basic SQL)."

     

    Do you mind expanding on that comment? I'm not 100% sure I'm following.

  • sdarba
    sdarba Member

    Thanks Jae! That makes a ton of sense and I appreciate you adding the additional use case examples. 

     

    Appreciate all the help!