Running Total

Options

Hi,

I have a question related to keeping a running total and then moving that running total to a new column called 'Most consecutive days' and I am not sure if I need to do this in a beast mode or MagicETL

Is there a way to, for example, product A sold for 10 days in a row, didnt sell a day then sold for 3 days in a row etc…. I would like to create table that shows product A and has the column names 'Current Consecutive Days' and then another column called 'Most consecutive days' and if the current consecutive days exceed 'Most consecutive days' then the new 'Most consecutive days' number is replaced by the 'Current Consecutive days' number.

Thanks in advance

Answers

  • GrantSmith
    Options

    Because you want to aggregate an aggregation you'll need to do it within Magic ETL.

    Start with the date dimension dataset and feed that into a formula tile to add a new column constant called Join Column with a value of 1.

    Take your original dataset feed it into a select columns to only select your product IDs and then into a remove duplicates to get a list of unique proudcts (alternatively you can use a products dataset if you already have this). Feed it into a formula tile and create a Join Column with a value of 1.

    Join the dates and product list together based on join column, This will have records for each day and product.

    Left join this to your original dataset so you'll get null values if your dataset doesn't have a quantity for a specific day. Feed it into a formula tile and use the following formula to set a flag if it's null or not:

    CASE WHEN `Qty` IS NULL THEN 1 ELSE 0 END
    

    Feed it into a Rank and Window Tile and do a running total across your entire dataset based on your flag field. This will give you a unique ID number for each time you have a run of days with product sales.

    Feed this into another Rank and Window tile to do a running total on your join column partition based on the new ID field to count the number of days within each run.

    Feed that into a group by based on the product field and take the MAX to get the most consecutive days. Join this back to your data from the rank and window based on the product ID.

    Now you'll have a list of all the dates and products and how many consecutive days there were sales.

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**
  • ColinHaze
    Options

    Can you explain the two rank and window tiles a little more? I seem to be getting stuck at this step.

  • ColinHaze
    Options

    For some reason I am getting 1.3Billion Output rows. When I put it in a table I am getting 4,177 consecutive days for all the products and 1.3billion for most consecutive days for all products