Creating Percentages for table groupings


I am trying to calculate, for each row in my dataset, what percent of the observations it represents at each of several levels of grouping. For example, if my data has 2 levels of grouping, the table would look like this:


group1   group2   count   percent_Grouping1 (A or B)   percent_Grouping2 (AA, AB, BA, BB)
a        a        10      33.33%                       45.45%
a        a         4      13.33%                       18.18%
a        a         8      26.67%                       36.36%
a        b         2       6.67%                       25%
a        b         6      20%                          75%
a        b         0       0%                           0%
b        a         9      18%                          27.27%
b        a        13      26%                          39.39%
b        a        11      22%                          33.33%
b        b         7      14%                          41.18%
b        b         6      12%                          35.29%
b        b         4       8%                          23.53%


I am looking for a way to get the percent of records at each level of grouping. In my actual data I have 5 levels of grouping that I need to consider, but I think this example gets my need across. I have been trying to write a Beast Mode calculation to do this, but I haven't found a working solution. Any help, either as a Beast Mode calculation or something else, would be greatly appreciated.


  • Valiant

    You'll need to compute the total count for each possible group combination in an ETL/transform and add those totals to your dataset as constant columns. Then a Beast Mode can pick the matching total for each row, something like:

    CASE WHEN `group1` = 'a' AND `group2` = 'a' THEN MAX(`aaTotal`)
    WHEN `group1` = 'a' AND `group2` = 'b' THEN MAX(`abTotal`)
    WHEN `group1` = 'b' AND `group2` = 'a' THEN MAX(`baTotal`)
    WHEN `group1` = 'b' AND `group2` = 'b' THEN MAX(`bbTotal`)
    END

    Let me know if you have questions on setting this up.
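
For anyone who wants to check the numbers, here is a minimal sketch in plain Python (outside Domo) of the arithmetic the ETL step would have to reproduce: total the counts for every group prefix, then divide each row's count by the total of the group it falls in. The function name `add_group_percents` and the `percent_levelN` keys are made up for illustration, and the rows are the ones from the example table. It is not Domo code, just a way to verify the expected percentages; it should extend to 5 levels by adding more key columns and passing `levels=5`.

```python
from collections import defaultdict

# Rows from the example table: (group1, group2, count).
rows = [
    ("a", "a", 10), ("a", "a", 4), ("a", "a", 8),
    ("a", "b", 2), ("a", "b", 6), ("a", "b", 0),
    ("b", "a", 9), ("b", "a", 13), ("b", "a", 11),
    ("b", "b", 7), ("b", "b", 6), ("b", "b", 4),
]


def add_group_percents(rows, levels):
    """Return one dict per row with a percent for each grouping depth 1..levels.

    Hypothetical helper: each row is (key_1, ..., key_levels, count).
    """
    # Pass 1: total count for every key prefix, e.g. ("a",) and ("a", "b").
    totals = defaultdict(int)
    for row in rows:
        keys, count = row[:levels], row[levels]
        for depth in range(1, levels + 1):
            totals[keys[:depth]] += count

    # Pass 2: divide each row's count by the total of the group it belongs to.
    result = []
    for row in rows:
        keys, count = row[:levels], row[levels]
        out = {"keys": keys, "count": count}
        for depth in range(1, levels + 1):
            total = totals[keys[:depth]]
            # Guard against groups whose total is zero to avoid dividing by zero.
            out[f"percent_level{depth}"] = 100.0 * count / total if total else 0.0
        result.append(out)
    return result


for r in add_group_percents(rows, levels=2):
    print(r["keys"], r["count"],
          f"{r['percent_level1']:.2f}%", f"{r['percent_level2']:.2f}%")
```

The two passes mirror the suggestion above: the first pass plays the role of the group totals added to the dataset as constants, and the second is the per-row division.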



