Magic ETL

Magic ETL

Sum of Distinct Count Variance

Hi

 

I need help on below beast mode not working.  I am trying to get the sum of all the variance using distinct count of customers.  I am trying to get my total points of distribution gained. Thank you in advance.

 

SUM(COUNT(DISTINCT
CASE WHEN `Year`= 2020 AND `Month #` <=`Month # for Today`
THEN `Customer Name`
END
))
-
SUM(COUNT(DISTINCT
CASE WHEN `Year`= 2019 AND `Month #` <=`Month # for Today`
THEN `Customer Name`
END
))

Welcome!

It looks like you're new here. Members get access to exclusive content, events, rewards, and more. Sign in or register to get started.
Sign In

Best Answer

  • Coach
    Answer ✓

    So ... while window functions are amazing for many things, it won't work when you're trying to do a count distinct.

     

    why?  

     

    SUM(

    COUNT(DISTINCT

    CASE WHEN `Year`= YEAR(CURRENT_DATE()) AND `Month #` <=`Month # for Today` THEN `Customer Name`
    END
    ))
    OVER ()

     

    keep in mind what is happening under the hood.  first Domo does a COUNT(DISTINCT) of customer names, for whatever is on the Axis.  (set of items) 

    then it adds up all the results.  

    so if i had item 1 = 15 distinct customers, item2  = 12 distinct customers and item3 = 31 distinc customers, i can't just sum 15,12,31... because presumeably i have overlap in the customers who bought items.

     

    SOLUTION

    APPEND your data onto itself (UNION) but change the Item number in the second UNION to be a constant 'Total Items'.

     

    now you can do a basic count(distinct customer) and you'll have a 'Total Items' ... item.

     

    Jae Wilson
    Check out my 🎥 Domo Training YouTube Channel 👨‍💻

    **Say "Thanks" by clicking the ❤️ in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"

Answers

  • Hi @user033690 

     

    You might be able to solve your problem using a windowing function however I'd recommend restructuring your data. I've done a previous write up regarding customized period over period data models. You can refer to https://dojo.domo.com/t5/Card-Building/show-percent-on-period-over-period-graph/m-p/50540/highlight/... for an in dept description of what needs to happen.

     

    The basics being you'd have a period type column for Current and Last Year (which you'd filter on just these two) which you can then use a beast mode to calculate the total for that period and then subtract the two numbers. I'd highly recommend this method as it gives you a lot more flexibility in your analysis.

     

     

    If you must use a beast mode you could try something like:

    1. SUM(COUNT(DISTINCT
    2. CASE WHEN `Year`= YEAR(CURRENT_DATE()) AND `Month #` <=`Month # for Today`
    3. THEN `Customer Name`
    4. END
    5. ))
    6. OVER ()
    7. -
    8. SUM(COUNT(DISTINCT
    9. CASE WHEN `Year`= YEAR(CURRENT_DATE())-1 AND `Month #` <=`Month # for Today`
    10. THEN `Customer Name`
    11. END
    12. ))
    13. OVER ()

    I've also replaced 2020 and 2019 with YEAR(CURRENT_DATE()) functions to automatically calculate the current year and last year (hence the -1) so you don't have to worry about changing the year value next year and in future years. (Unless you specifically want 2020 and 2019 then use those values)

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**
  • Thank you for the response Grant.  

     

    I tried the BM you suggested. unfortunately it gives me a different result.  Please review the table below.  I am looking to get the FY20-FY19 variance both by item and as a total.  I hope you can help me figuring this out.  Thank you in advance.

     

    Item NumberFY20 Customer CountFY19 Customer CountBM resultShould be result
    1012B96848912
    1042B9183898
    10449084896
    1022B80698911
    1014B6764893
    1092B5852896
    Total48243653446
  • Coach
    Answer ✓

    So ... while window functions are amazing for many things, it won't work when you're trying to do a count distinct.

     

    why?  

     

    SUM(

    COUNT(DISTINCT

    CASE WHEN `Year`= YEAR(CURRENT_DATE()) AND `Month #` <=`Month # for Today` THEN `Customer Name`
    END
    ))
    OVER ()

     

    keep in mind what is happening under the hood.  first Domo does a COUNT(DISTINCT) of customer names, for whatever is on the Axis.  (set of items) 

    then it adds up all the results.  

    so if i had item 1 = 15 distinct customers, item2  = 12 distinct customers and item3 = 31 distinc customers, i can't just sum 15,12,31... because presumeably i have overlap in the customers who bought items.

     

    SOLUTION

    APPEND your data onto itself (UNION) but change the Item number in the second UNION to be a constant 'Total Items'.

     

    now you can do a basic count(distinct customer) and you'll have a 'Total Items' ... item.

     

    Jae Wilson
    Check out my 🎥 Domo Training YouTube Channel 👨‍💻

    **Say "Thanks" by clicking the ❤️ in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"

Welcome!

It looks like you're new here. Members get access to exclusive content, events, rewards, and more. Sign in or register to get started.
Sign In