Aggregate a Beast Mode

Options
Chris_Wolman
Chris_Wolman Contributor
edited July 2023 in Beast Mode

Hello,

I am working with a retail dataset and need to calculate beginning and ending inventories based on a user selected date range. The calculation is straight forward but the problem is aggregating the result so it displays on one row per item. The goal is a table card of items reporting beginning inventory, ending inventory, units, and dollars sold in the selected period.

Example calculation for beginning inventory:

case
when Date=min(Date) OVER (partition BY Store,Item No,Variant No) then sum(NAV_OnHand)
else 0
end

All attempts to aggregate the case statement fail. The problem is Domo displays a row for each date in the selected period and I know this is due to the evaluation of the date on each row.

Any ideas how I can keep this calculation dynamic based on user date selection. The only alternative I can think of would be to have a row for every item and every day with the beginning and ending inventory calculated in the dataflow. I would prefer not to do that for 100s of thousands of items unless there is no other way to provide this functionality.

Thank you,

Chris

Tagged:

Best Answer

  • marcel_luthi
    marcel_luthi Coach
    Answer ✓
    Options

    @Chris_Wolman without knowing what your data really looks like is hard for me to picture a full path to follow. In general if what you want is to be able to know what was the inventory at any given point in time (captured using a Variable as @MichelleH suggested), is to have your Raw data being the movements to inventory by date, so you'll have entries whenever things are sold and whenever you acquire new things to sell, and then doing a best mode that aggregates those for any date prior or on the provided variable:

    SUM(CASE WHEN movement date <= Variable date ANDmovement type= 'IN' THEN qty END) - SUM(CASE WHEN movement date <= Variable date ANDmovement type= 'OUT' THEN qtyEND)
    

    If you have 2 date variables for the Start and End you'd simply have two Beast Modes that tells you the inventory at each point in time, and the advantage of this is that you can also show how many IN movements were performed between the two dates vs how many movements out (I guess your inventory can go up if you acquire multiple units to sell during the period).

    So using your example table, you'd need to do an ETL that would transform that to:

    ITEM A 4/15/2023 IN 10
    ITEM A 5/5/2023 OUT 2
    ITEM A 5/6/2023 OUT 1

    You can go even more detailed if besides showing the direction of the movement, you also specify why, like if you have OUT because of sales, OUT because of damaged good, IN by purchases or IN by returns, so this model would be easy to scale to the business needs, but you need to figure out first how to get the individual movement list out (ETL would be your best bet if you don't already have a dataset with this information).

Answers

  • MichelleH
    Options

    @Chris_Wolman Typically, aggregation functions should be outside of the entire case statement in order to return a single row. However in this case the min date is also causing an issue because you would have an aggregate inside an aggregate (min inside a sum). Based on your use case, I would suggest setting up two date variables for your start and end dates so you can avoid the second aggregation with a beast mode like this:

    sum(case when `Date` = `Start Date (Variable)` then `NAV_OnHand` else 0 end)
    

  • Chris_Wolman
    Chris_Wolman Contributor
    Options

    Hi @MichelleH ,

    Variables will not work in this case because there may not be a row in the data for that start date. Example, user selects a start date of 5/1/2023 and end date of 5/15/2023. The earliest date available for the item is 5/5/2023. I would need the calculation to use the earliest date in this scenario which is why the comparison is against the min(date). If there was a beginning inventory for every item and every date, it would work. I can create the dates in the dataflow but my resulting dataset will be in the hundreds of millions rows, which is I was hoping to keep the calculation dynamic.

    Thank you,

    Chris

  • MichelleH
    Options

    @Chris_Wolman Can you provide some more information about the business case for why the starting inventory for an item that wasn't available until 5/5 would not be 0 for the period between 5/1 and 5/15?

  • Chris_Wolman
    Chris_Wolman Contributor
    Options

    Sure, here is an example of the data.

    The above illustration shows 10 units received on 4/15/2023. No sales for the item until 5/5/2023. The expected result is one row for Item A with beginning inventory = 10, ending inventory of 7, and sales of 3.

    This is why we are using a windowed function to get the beginning inventory. Not every item will have a row for the exact date range selected.

  • marcel_luthi
    Options

    So on your data, you have the current number of items in a field OnHand? If this is the case, this would seem an ideal scenario for a flex table instead, have you looked at this option and configured the columns to show First Value, Last Value and the change?

  • MichelleH
    Options

    I agree with @marcel_luthi I think a flex table or some other time series chart would be better for visualizing this.

  • Chris_Wolman
    Chris_Wolman Contributor
    Options

    Wouldn't the flex table give me the wrong beginning inventory because the 4/15/2023 row would be excluded when filtering 5/1-5/15 on the card?

  • MichelleH
    Options

    @Chris_Wolman The 4/15/2023 row would be excluded with those filters in all of these scenarios. I guess I'm still not clear on what questions you are trying to answer with this data.

  • marcel_luthi
    marcel_luthi Coach
    Answer ✓
    Options

    @Chris_Wolman without knowing what your data really looks like is hard for me to picture a full path to follow. In general if what you want is to be able to know what was the inventory at any given point in time (captured using a Variable as @MichelleH suggested), is to have your Raw data being the movements to inventory by date, so you'll have entries whenever things are sold and whenever you acquire new things to sell, and then doing a best mode that aggregates those for any date prior or on the provided variable:

    SUM(CASE WHEN movement date <= Variable date ANDmovement type= 'IN' THEN qty END) - SUM(CASE WHEN movement date <= Variable date ANDmovement type= 'OUT' THEN qtyEND)
    

    If you have 2 date variables for the Start and End you'd simply have two Beast Modes that tells you the inventory at each point in time, and the advantage of this is that you can also show how many IN movements were performed between the two dates vs how many movements out (I guess your inventory can go up if you acquire multiple units to sell during the period).

    So using your example table, you'd need to do an ETL that would transform that to:

    ITEM A 4/15/2023 IN 10
    ITEM A 5/5/2023 OUT 2
    ITEM A 5/6/2023 OUT 1

    You can go even more detailed if besides showing the direction of the movement, you also specify why, like if you have OUT because of sales, OUT because of damaged good, IN by purchases or IN by returns, so this model would be easy to scale to the business needs, but you need to figure out first how to get the individual movement list out (ETL would be your best bet if you don't already have a dataset with this information).

  • Chris_Wolman
    Chris_Wolman Contributor
    Options

    Thank you @marcel_luthi and @MichelleH . I believe the use of variables for start and end date along with identifying inventory movement is the solution that will allow us to calculate the beginning and ending inventories for the period any user selects.