bucket courses by count of grade items, visualize count of courses in each bucket by department
I have a large data set with 29 columns and over a million rows. The rows are each grade item in a course. Courses with no grade items are listed as a single row with a null for the grade item. The data set also includes key details like instructor, semester, department, type of activity associated with the grade item, as well as associated attributes with these fields.
I'm looking to count grade items by course so that courses can be put into buckets based on defined thresholds of quantity of grade items. From there, I want to visualize this by department, showing the count of courses that are in each bucket for that department. It will be important to view things by department because there are over 200 departments and over 85,000 courses. This is one one visualization in a set of other visualizations I already have working.
I do have a beast mode of:
CASE when `GradeObjectId` IS NULL then 'No grade items' when (COUNT(`GradeObjectId`)) = 1 then '1 grade item' when (COUNT(`GradeObjectId`)) < 5 then '2-4 grade items' when (COUNT(`GradeObjectId`)) >= 5 then '5 or more grade items' End
Some key fields in the data set are:
Field | Data Type | What is is |
UserName | Text | Faculty User Name |
CourseName | Text | Course Name |
CourseCode | Text | Course Number |
DeptCode | Text | Department Identifier |
Semester_group | Text | Semester Identifier |
GradeNames_grouped | Text | Name of grade item |
ToolName | Text | Activity type associated with grade item |
GradeObjectID | Number | Unique ID for grade item |
OrgUnitID | Number | Unique ID for course |
Where I am falling short is I can get a visualization that shows the bucket identifier by the course, but once I try to visualize it by department, it is counting things for the entire department (therefore all departments are tagged as buckets of 5). I've experimented with pivot tables, tables with heatmaps, and horizontal bars (to allow for a long scrolling page). I have also tried doing four beast modes to essentially have a column for each bucket something like:
[beast mode 1] CASE when `GradeObjectId` IS NULL then 1 else 0 End [beast mode 2] CASE when (COUNT(`GradeObjectId`)) = 1 then 1 else 0 End
When used in a pivot table, it is not allowing me to sum things like the beast mode 2 column but is allowing me to sum beast mode 1 column. For things like the beast mode 2 column, it is showing more like a T/F for if this department has an item that meets that criteria.
One method I am debating on using is creating a new data flow based on the original flow, grouping by the course identifier and aggregating for count of grade item, then doing a join to bring back in all of the details I want to use in this one visualization back into the new data set. There are drawbacks. The major one is that each course may have multiple faculty and will have multiple types of grade items. I will lose the ability to report on those types of details if I aggregate (there will no longer be separate rows of the details). I can keep both data flows for different purposes, but then I have two data flows.
I wanted to see if anyone had any thoughts on how I could go about tackling this without creating a separate flow. I am flexible with the type of visualization used and am willing to experiment.
Comments
-
to tie it all up, it sounds like you want to do a visualization that counts the bucketed aggregation of a metric. Unfortunately you can't do that in Domo, because effectively you're asking anayzer to do a GROUP BY twice.
SELECT
Buckets
count(*)
FROM
(
SELECT
Course,
Department
CASE
WHEN count(*) BETWEEN 1 and 2 then '1-2 items'
...
ELSE 'no Items' END as Buckets
count(*)
FROM
table
GROUP BY
Course,
Department,
CASE
WHEN .. then '1-2 items'
ELSE 'no Items' END as Buckets
) as firstGroup
GROUP BY Buckets, Department
It's not possible in Analyzer.
What I would try.
You can try using the new Dataset View beta to create the subQuery where you produce 'firstGroup' but it may not work because Analyzer AND Dataset Views execute queries at runtime, and it's unclear to me how domo's query engine handles queries passed from analyzer to a dataset view.
If the first attempt fails, I'd ask Support to optimize the view, hoping that in doing so they materialize the data view. If the view is materialized it should be have the same as any dataset and therefore 'just work.'
If that fails, I'd consider using 'dataset copy' connector to force Domo to materialize my dataset view, which guaranteed will work.
WHAT OTHER PEOPLE WOULD DO.
If you have time, (no ETL pipeline constraints) just run a MagicETL. Magic2.0 should be VERY fast with the described requirement.
Jae Wilson
Check out my 🎥 Domo Training YouTube Channel 👨💻
**Say "Thanks" by clicking the ❤️ in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"0 -
I ended up creating a second Magic ETL flow. I'll end up having two flows for each major metric (grades, content, quizzes, etc.).
0
Categories
- All Categories
- 1.8K Product Ideas
- 1.8K Ideas Exchange
- 1.6K Connect
- 1.2K Connectors
- 300 Workbench
- 6 Cloud Amplifier
- 9 Federated
- 2.9K Transform
- 102 SQL DataFlows
- 626 Datasets
- 2.2K Magic ETL
- 3.9K Visualize
- 2.5K Charting
- 753 Beast Mode
- 61 App Studio
- 41 Variables
- 692 Automate
- 177 Apps
- 456 APIs & Domo Developer
- 49 Workflows
- 10 DomoAI
- 38 Predict
- 16 Jupyter Workspaces
- 22 R & Python Tiles
- 398 Distribute
- 115 Domo Everywhere
- 276 Scheduled Reports
- 7 Software Integrations
- 130 Manage
- 127 Governance & Security
- 8 Domo Community Gallery
- 38 Product Releases
- 11 Domo University
- 5.4K Community Forums
- 40 Getting Started
- 30 Community Member Introductions
- 110 Community Announcements
- 4.8K Archive