Average Basket Size

I am looking for a way to create an average basket size by date

 

Assuming I have data set that looks like this:

InvoiceDateSKUSold Price
111Jan 4Product A10
111Jan 4Product B20
112Jan 4Product C5
113Jan 4Product A10
113Jan 4Product B20
113Jan 4Product C5
114Jan 20Product A10
115Jan 20Product A10
115Jan 20Product C5
116Jan 20Product D45

 

In the example above the Average basket size during Jan 4 would be the average of invoices 111, 112, and 113: (30 + 5 + 35) / 3 = 23.333

Whereas on Jan 20, the average basket size was average of invoices (114,115,116): (10+5+45)/3 = 20

 

The goal would be to ultimately create a chart that will show what the average basket size is over time (likely grouping the individual dates by week), but before I get to that point, I assume I need to do a beastmode calculation to actually populate those values first

Best Answer

Answers

  • Good morning, 

     

    I believe you could do a standard bar chart. the Y axis would be your date field, and the X, your Sold Price. Then chage the Claculation to Average. It will probably try to SUM or Count it at first. Is that what you're looking for?

  • Hi,  a beastmode calculation would be

     

    Sum(`Sold Price`) / Count(Distinct `SKU`)    but, the problem with it would be at the moment of displaying the data , can be tricking since you will have to group by date ... the moment you add by SKU , the calculation will give you the same value as the sum of Sold Price.

     

    If I undestand well , you want to add a column with the Average Basket Size for the day to apply to all rows

    Like this Magic ETL shows :

    Screen Shot 03-25-19 at 01.05 PM.PNG

     

    If this is your desired outcome follow this qucik steps to build a magic ETL flow to:

     

    1) Group by date the sum of Sold Price and the Count distinct of SKUs

    Screen Shot 03-25-19 at 01.11 PM.PNG

     

    2) Use the calculator to create the Average Basket Size for the day

     

    Screen Shot 03-25-19 at 01.14 PM 001.PNG

     

    3) Join the calculation back to the original dataset using a left join , then select the columns you want to display in the dataset and generate a new output dataset

    Screen Shot 03-25-19 at 01.17 PM.PNG

     

     

     

     

     

    Domo Arigato!

    **Say 'Thanks' by clicking the thumbs up in the post that helped you.
    **Please mark the post that solves your problem as 'Accepted Solution'
  •  


    @Sweep_The_Leg wrote:

    Good morning, 

     

    I believe you could do a standard bar chart. the Y axis would be your date field, and the X, your Sold Price. Then chage the Claculation to Average. It will probably try to SUM or Count it at first. Is that what you're looking for?


    Unfortunately this wouldn't show the right numbers in my case.

    Reason being, each product purchased shows up on a separate line of the data. So in the example above, for january 4, it would average 10,20,5,10,20,5 = 11.67 instead of combining the 3 transactions there to arrive at a basket size average of (30,5,35) = 23.33. 

    Thanks for the suggestion though

  • gotcha, you're grouping by Date and Invoice. Sorry i missed that. I would recocommend the other suggestion about going through an ETL. 

  • Ha! So I messed up my original numbers, and just so happened to pick numbers that accidentally resulted in the exact same average basket size on both Jan 4 and Jan 20. Whoops!

     

    I looked in to instead using the Magic ETL flow and it works perfectly well on the per-day level.

    The challenge ends up being when you group more than two days in an actual chart /card

     

    InvoiceDateSKUSold Price
    1114-JanProduct A10
    1114-JanProduct B20
    1124-JanProduct C5
    1134-JanProduct A10
    1134-JanProduct B20
    1134-JanProduct C5
    11420-JanProduct A10
    11520-JanProduct A10
    11520-JanProduct C5
    11620-JanProduct D45
    11720-JanProduct E2
    11820-JanProduct E2
    11923-JanProduct E2
    12023-JanProduct E2
    12123-JanProduct E2
    12223-JanProduct E2
    12323-JanProduct E2


    So If I add a few new sales, (transaction 117 - 123)

    The avg basket size on Jan 4 ends up being 23.33 and the avg basket size on Jan 20 ends up being 14.8, and the average for Jan 23 is 2.

     

    The challenge ends up being that when I move this dataset output in to creating a Domo Card. If I try to group by Month, the average basket size ends up averaging those 3 subtotals, rather than all of the individual transactions. In other words, the average basket size reported by the domo card ends up being (23.33 + 14.8 + 2) / 3 = 13.38 whereas the ACTUAL average basket size would be the average of all the individual transactions. Averaging the sold prices below gives us a real average basket size of 11.85

    InvoiceSum of Sold Price
    11130
    1125
    11335
    11410
    11515
    11645
    1172
    1182
    1192
    1202
    1212
    1222
    1232

     

    It seems to me that there may not be a way to have a true calculation of basket size. my only thought would be if within beast mode there would be a way to write a more complicated mysql statement, but not sure if such a statement would be possible.

  • Ha! So I messed up my original numbers, and just so happened to pick numbers that accidentally resulted in the exact same average basket size on both Jan 4 and Jan 20. Whoops!

     

    I looked in to instead using the Magic ETL flow and it works perfectly well on the per-day level.

    The challenge ends up being when you group more than two days in an actual chart /card

     

    InvoiceDateSKUSold Price
    1114-JanProduct A10
    1114-JanProduct B20
    1124-JanProduct C5
    1134-JanProduct A10
    1134-JanProduct B20
    1134-JanProduct C5
    11420-JanProduct A10
    11520-JanProduct A10
    11520-JanProduct C5
    11620-JanProduct D45
    11720-JanProduct E2
    11820-JanProduct E2
    11923-JanProduct E2
    12023-JanProduct E2
    12123-JanProduct E2
    12223-JanProduct E2
    12323-JanProduct E2


    So If I add a few new sales, (transaction 117 - 123)

    The avg basket size on Jan 4 ends up being 23.33 and the avg basket size on Jan 20 ends up being 14.8, and the average for Jan 23 is 2.

     

    The challenge ends up being that when I move this dataset output in to creating a Domo Card. If I try to group by Month, the average basket size ends up averaging those 3 subtotals, rather than all of the individual transactions. In other words, the average basket size reported by the domo card ends up being (23.33 + 14.8 + 2) / 3 = 13.38 whereas the ACTUAL average basket size would be the average of all the individual transactions. Averaging the sold prices below gives us a real average basket size of 11.85

    InvoiceSum of Sold Price
    11130
    1125
    11335
    11410
    11515
    11645
    1172
    1182
    1192
    1202
    1212
    1222
    1232

     

    It seems to me that there may not be a way to have a true calculation of basket size. my only thought would be if within beast mode there would be a way to write a more complicated mysql statement, but not sure if such a statement would be possible.

  • what about two groupings in the ETL?

     

    group 1

    group by date, invoice, SUM of Sold Price

     

    Group 2 (grouping off of group 1)

    group by Date, average of Sum of Sold Price. 

  • Sorry, I'm not sure I understand exactly how two groupings would help me. Would you be able to elaborate?

  • Sweep_The_Leg
    Sweep_The_Leg Member
    Answer ✓

    see attached power point, i got to your 11.8. you'll have to sneek in a date operation. 

  • This will work for what I need. Looks like the answer is that you effectively need to hard-code the averages in to the data set through magic ETL before you apply it to a domo card. Was able to make this work for my data, and was able to set up domo cards that will accurately represent the average Monthly basket size. 

     

    Then, the domo card shows an average across the months I've selected - that average isn't 100% accurate, since it's the average of 3 averages. If one month had 100 transactions with average basket size of 100, and another month had 1 transaction with an average basket size of 50, the card would just show me an average of 75. Fortunately, the data sets i'm working with are large enough that I don't think I will have this issue.

     

    thanks for all your help!