bucket courses by count of grade items, visualize count of courses in each bucket by department

I have a large data set with 29 columns and over a million rows. The rows are each grade item in a course. Courses with no grade items are listed as a single row with a null for the grade item. The data set also includes key details like instructor, semester, department, type of activity associated with the grade item, as well as associated attributes with these fields. 

 

I'm looking to count grade items by course so that courses can be put into buckets based on defined thresholds of quantity of grade items. From there, I want to visualize this by department, showing the count of courses that are in each bucket for that department. It will be important to view things by department because there are over 200 departments and over 85,000 courses. This is one one visualization in a set of other visualizations I already have working.

 

I do have a beast mode of:

 

CASE
when `GradeObjectId` IS NULL then 'No grade items'
when (COUNT(`GradeObjectId`)) = 1 then '1 grade item'
when (COUNT(`GradeObjectId`)) < 5 then '2-4 grade items'
when (COUNT(`GradeObjectId`)) >= 5 then '5 or more grade items'
End

 

 

Some key fields in the data set are:

Field

Data Type

What is is

UserName

Text

Faculty User Name

CourseName

Text

Course Name

CourseCode

Text

Course Number

DeptCode

Text

Department Identifier

Semester_group

Text

Semester Identifier

GradeNames_grouped

Text

Name of grade item

ToolName

Text

Activity type associated with grade item

GradeObjectID

Number

Unique ID for grade item

OrgUnitID

Number

Unique ID for course

 

Where I am falling short is I can get a visualization that shows the bucket identifier by the course, but once I try to visualize it by department, it is counting things for the entire department (therefore all departments are tagged as buckets of 5). I've experimented with pivot tables, tables with heatmaps, and horizontal bars (to allow for a long scrolling page). I have also tried doing four beast modes to essentially have a column for each bucket something like:

 

[beast mode 1] 
CASE
when `GradeObjectId` IS NULL then 1
else 0
End

[beast mode 2] 
CASE
when (COUNT(`GradeObjectId`)) = 1 then 1
else 0
End

 

 

When used in a pivot table, it is not allowing me to sum things like the beast mode 2 column but is allowing me to sum beast mode 1 column. For things like the beast mode 2 column, it is showing more like a T/F for if this department has an item that meets that criteria. 

PivotTable.png

 

One method I am debating on using is creating a new data flow based on the original flow, grouping by the course identifier and aggregating for count of grade item, then doing a join to bring back in all of the details I want to use in this one visualization back into the new data set. There are drawbacks. The major one is that each course may have multiple faculty and will have multiple types of grade items. I will lose the ability to report on those types of details if I aggregate (there will no longer be separate rows of the details). I can keep both data flows for different purposes, but then I have two data flows.

 

I wanted to see if anyone had any thoughts on how I could go about tackling this without creating a separate flow.  I am flexible with the type of visualization used and am willing to experiment.

Comments

  • to tie it all up, it sounds like you want to do a visualization that counts the bucketed aggregation of a metric.  Unfortunately you can't do that in Domo, because effectively you're asking anayzer to do a GROUP BY twice.

     

    SELECT 

    Buckets

    count(*)

    FROM

    (

    SELECT

    Course,

    Department 

    CASE

    WHEN count(*) BETWEEN 1 and 2 then '1-2 items'

    ...

    ELSE 'no Items' END as Buckets

    count(*)

    FROM

    table

    GROUP BY

    Course,

    Department, 

    CASE

    WHEN .. then '1-2 items'

    ELSE 'no Items' END as Buckets

    ) as firstGroup

     

    GROUP BY Buckets, Department

     

    It's not possible in Analyzer.

     

    What I would try.

    You can try using the new Dataset View beta to create the subQuery where you produce 'firstGroup' but it may not work because Analyzer AND Dataset Views execute queries at runtime, and it's unclear to me how domo's query engine handles queries passed from analyzer to a dataset view.  

     

    If the first attempt fails, I'd ask Support to optimize the view, hoping that in doing so they materialize the data view.  If the view is materialized it should be have the same as any dataset and therefore 'just work.'

     

    If that fails, I'd consider using 'dataset copy' connector to force Domo to materialize my dataset view, which guaranteed will work.

     

    WHAT OTHER PEOPLE WOULD DO.

    If you have time, (no ETL pipeline constraints) just run a MagicETL.  Magic2.0 should be VERY fast with the described requirement.

    Jae Wilson
    Check out my 🎥 Domo Training YouTube Channel 👨‍💻

    **Say "Thanks" by clicking the ❤️ in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"
  • I ended up creating a second Magic ETL flow. I'll end up having two flows for each major metric (grades, content, quizzes, etc.).

This discussion has been closed.