bucket courses by count of grade items, visualize count of courses in each bucket by department

I have a large data set with 29 columns and over a million rows. The rows are each grade item in a course. Courses with no grade items are listed as a single row with a null for the grade item. The data set also includes key details like instructor, semester, department, type of activity associated with the grade item, as well as associated attributes with these fields.

I'm looking to count grade items by course so that courses can be put into buckets based on defined thresholds of quantity of grade items. From there, I want to visualize this by department, showing the count of courses that are in each bucket for that department. It will be important to view things by department because there are over 200 departments and over 85,000 courses. This is one one visualization in a set of other visualizations I already have working.

I do have a beast mode of:

CASE
when `GradeObjectId` IS NULL then 'No grade items'
when (COUNT(`GradeObjectId`)) = 1 then '1 grade item'
when (COUNT(`GradeObjectId`)) < 5 then '2-4 grade items'
when (COUNT(`GradeObjectId`)) >= 5 then '5 or more grade items'
End

Some key fields in the data set are:

Field	Data Type	What is is
UserName	Text	Faculty User Name
CourseName	Text	Course Name
CourseCode	Text	Course Number
DeptCode	Text	Department Identifier
Semester_group	Text	Semester Identifier
GradeNames_grouped	Text	Name of grade item
ToolName	Text	Activity type associated with grade item
GradeObjectID	Number	Unique ID for grade item
OrgUnitID	Number	Unique ID for course

Where I am falling short is I can get a visualization that shows the bucket identifier by the course, but once I try to visualize it by department, it is counting things for the entire department (therefore all departments are tagged as buckets of 5). I've experimented with pivot tables, tables with heatmaps, and horizontal bars (to allow for a long scrolling page). I have also tried doing four beast modes to essentially have a column for each bucket something like:

[beast mode 1] 
CASE
when `GradeObjectId` IS NULL then 1
else 0
End

[beast mode 2] 
CASE
when (COUNT(`GradeObjectId`)) = 1 then 1
else 0
End

When used in a pivot table, it is not allowing me to sum things like the beast mode 2 column but is allowing me to sum beast mode 1 column. For things like the beast mode 2 column, it is showing more like a T/F for if this department has an item that meets that criteria.

One method I am debating on using is creating a new data flow based on the original flow, grouping by the course identifier and aggregating for count of grade item, then doing a join to bring back in all of the details I want to use in this one visualization back into the new data set. There are drawbacks. The major one is that each course may have multiple faculty and will have multiple types of grade items. I will lose the ability to report on those types of details if I aggregate (there will no longer be separate rows of the details). I can keep both data flows for different purposes, but then I have two data flows.

I wanted to see if anyone had any thoughts on how I could go about tackling this without creating a separate flow. I am flexible with the type of visualization used and am willing to experiment.

Find more posts tagged with

Comments

jaeW_at_Onyx

to tie it all up, it sounds like you want to do a visualization that counts the bucketed aggregation of a metric. Unfortunately you can't do that in Domo, because effectively you're asking anayzer to do a GROUP BY twice.

SELECT

Buckets

count(*)

FROM

(

SELECT

Course,

Department

CASE

WHEN count(*) BETWEEN 1 and 2 then '1-2 items'

...

ELSE 'no Items' END as Buckets

count(*)

FROM

table

GROUP BY

Course,

Department,

CASE

WHEN .. then '1-2 items'

ELSE 'no Items' END as Buckets

) as firstGroup

GROUP BY Buckets, Department

It's not possible in Analyzer.

What I would try.

You can try using the new Dataset View beta to create the subQuery where you produce 'firstGroup' but it may not work because Analyzer AND Dataset Views execute queries at runtime, and it's unclear to me how domo's query engine handles queries passed from analyzer to a dataset view.

If the first attempt fails, I'd ask Support to optimize the view, hoping that in doing so they materialize the data view. If the view is materialized it should be have the same as any dataset and therefore 'just work.'

If that fails, I'd consider using 'dataset copy' connector to force Domo to materialize my dataset view, which guaranteed will work.

WHAT OTHER PEOPLE WOULD DO.

If you have time, (no ETL pipeline constraints) just run a MagicETL. Magic2.0 should be VERY fast with the described requirement.

KatieBush

I ended up creating a second Magic ETL flow. I'll end up having two flows for each major metric (grades, content, quizzes, etc.).