Rolling Average for Each Category
I have a dataset containing total monthly units by month and by person. I'd like to calculate the 3 month rolling average for each person. How should I group the data so that the rolling average calculation goes by person and not by date?
Answers
-
This is how my dataset looks so far
0 -
Hi @JR_S1L
You'd need to keep your data as is to partition based on name to get the lag for the prior 3 months to get your rolling average. You can use a rank & window tile in Magic ETL to calculate the lag just make sure you're partitioning by the name and that you're sorting/ordering by the month
**Was this post helpful? Click Agree or Like below**
**Did this solve your problem? Accept it as a solution!**1 -
Hi @GrantSmith ....I am new to lag functions, can you explain a bit further? This is what I have so far when calculating the lag using Magic ETL...how would I then calculate the rolling average?
0 -
I've done a writeup of lag functions and rolling averages before utilizing Beast Modes but the idea at the end is the same concept. You just need to calculate the Lag for 1 month ago and 2 months ago into three separate fields. Then add all 3 together (current month, 1 month ago and 2 months ago) and divide the total by 3 using a formula tile. That will get your rolling 3 month average.
This is depending on how you want to calculate the rolling three months. Is that the current month and the prior 2 months or the prior 3 months excluding the current month? If you want the prior 3 months just calculate the lag for 3 months ago and add that value instead of the current value.
If you're interested here's the writeup using a beast mode (but if your rolling average won't need to have filters dynamically applied to it I'd recommend keeping it in the magic etl): https://dojo.domo.com/discussion/52679/domo-ideas-conference-beast-modes-rolling-averages#latest
**Was this post helpful? Click Agree or Like below**
**Did this solve your problem? Accept it as a solution!**1 -
+1 on Grant's solution using Lag(1) .. Lag(2), Lag(3) IF YOU DON"T HAVE GAPS IN YOUR DATA.
Only thing I strongly dislike about this approach is that you have to create 3 functions (not efficient) AND it must be true that there are no gaps in data.
Computationally, @GrantSmith i would say this is probably the only good use case for MySQL b/c it allows you to calculate LAG without the assumption of contigusous data (cc. @MarkSnodgrass
NOTE THIS IS ALL UNTESTED CODE there are probably missing commas, and when you set variables (as opposed to display you must use := and variable names are case sensitive.
if you write
SELECT t.* , r.* FROM table t , (SELECT @person := '' , @date := null FROM ) r ORDER BY t.Date
You are CROSS APPLYing a generated table with one row (r) to every row of table t where table r contains variables person and date which are insantiated as blank and null respectively
from there if you write
SELECT @date as prevDate, t.*, @date := t.date as _setDateVariable FROM table t , (SELECT @person := '' , @date := null FROM ) r ORDER BY t.Date
when MySQL reads this function it will process each column of my SELECT statement in order (top to bottom). so when if i was on row 3 of my table...
prevDate is the value of assigned to @date (which happens to be the value of t.date from row 2 -- i'll explain in a second), then it reads all the values for that row of table t, then it assigns @date the value of t.date for the current row (row 3)
when i SELECT row 4...
prevDate is the value of @date (row 3) , t.* is all the contents of row 4, then i assign ( using := ) the value of t.date (row 4) to @date.
and so on.
implement partitioning and test for previous day
SELECT -- execute tests. testing @person implements my Partition clause -- testing @lag1Date ensures the no gaps rule (this is the key part that Magic doesn't give you) case when t.person = @person and @lag1Date = date_add(t.date, interval -1 day) then @lag1Value end as prevDayValue, t.*, -- paritioning assumes an ORDER BY clause -- OR reset my variables to null @lag1Value := case when t.person = @person then t.value else null end as _setLag1Value, @lag1Date := case when t.person = @person then t.date else null end as _setLag1Date, @person := t.person as _setPersionVariable FROM table t , (SELECT @person := '' , @date := null , @lag1Value:=null, @lag1Date:=null, FROM ) r ORDER BY t.person, t.Date
Implement Cascading testing
SELECT -- execute tests. testing @person implements my Partition clause -- testing @lag1Date ensures the no gaps rule (this is the key part that Magic doesn't give you) case when t.person = @person and @lag1Date = date_add(t.date, interval -1 day) then @lag1Value end as prevDayValue, case when t.person = @person and @lag2Date = date_add(t.date, interval -2 day) then @lag2Value end as prevDayValue, t.*, -- cascade the values of lag1 to lag2 or reset to null @lag2Value := case when t.person = @person then @lag1Value else null end as _setLag2Value, @lag2Date := case when t.person = @person then @lag2Value else null end as _setLag2Date, @lag1Value := case when t.person = @person then t.value else null end as _setLag1Value, @lag1Date := case when t.person = @person then t.date else null end as _setLag1Date, @person := t.person as _setPersionVariable FROM table t , (SELECT @person := '' , @date := null , @lag1Value:=null, @lag1Date:=null, @lag2Value := null lag2Date:=null FROM ) r ORDER BY t.person, t.Date
Jae Wilson
Check out my 🎥 Domo Training YouTube Channel 👨💻
**Say "Thanks" by clicking the ❤️ in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"1 -
Thinking it through another set based way of approaching this without variables would be to do what SQL has to do under the covers (generate windows) for each row in your dataset we'll generate the 4 rows, then we'll assign the value of the transaction for those lagged rows if they exist, then we'll aggregate.
this assumes we have a numbers table, sys_num, that contains integers between 1 and 3 in a column called n.
NOTE this is untested code but i think it works.
P.S. this code you could implement in Magic if you wanted to avoid writing SQL. it would perform faster since we didn't index MySQL.
duplicate (using CROSS APPLY) the rows in t.table 4 times
CREATE TABLE referenceTable as SELECT distinct date_add(t.date, interval n day) as report_date t.person t.value as lag_value n.num as lag_num t.date as actual_transaction_date FROM table t , (SELECT num from sys_num where num beteen 0 and 3) n
create your windowed data
for each row in table t, you should have duplicate rows if table has a matching 0, 1, 2, or 3 day offset (report_date).
CREATE TABLE blowOut as SELECT t.person, , t.date, , r.lag_num , r.lag_value , case when lag_num = 1 then r.lag_value as lag1day end , case when lag_num =2 then r.lag_value as lag2day end , case when lag_num=3 then r.lag_value as lag3day end , case when lag_num=0 then r.lag_value as lag0day end FROM table t JOIN referenceTable r on t.person = r.person and t.date = r.report_date
deduplicate rows
create table lag_table as SELECT max(lag_1day) as lag1day, max(lag_2day) as lag2day, max(lag_3day) as lag3day, max(lag4_day as lag4day, date, person from blowOut GROUP BY date, person
then join in the data for that person and that row
select t.* <lt.columns> from table t inner join lag_table lt on t.person = lt.person and t.date = lt.date
Jae Wilson
Check out my 🎥 Domo Training YouTube Channel 👨💻
**Say "Thanks" by clicking the ❤️ in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"1 -
@GrantSmith , @jaeW_at_Onyx thank you both so much!!! I realized it's because we don't have window functions enabled in our domo instance that this was proving impossible for me...I've reached out to our rep to have them enabled so that I can apply the options that you provided! Again, thank you!!
0 -
@GrantSmith , @jaeW_at_Onyx just to confirm...in order to be able to dynamically filter by name, these calculations need to be done in beast mode correct?
0
Categories
- All Categories
- 1.9K Product Ideas
- 1.9K Ideas Exchange
- 1.6K Connect
- 1.3K Connectors
- 302 Workbench
- 6 Cloud Amplifier
- 9 Federated
- 2.9K Transform
- 104 SQL DataFlows
- 637 Datasets
- 2.2K Magic ETL
- 3.9K Visualize
- 2.5K Charting
- 761 Beast Mode
- 65 App Studio
- 42 Variables
- 702 Automate
- 182 Apps
- 458 APIs & Domo Developer
- 52 Workflows
- 10 DomoAI
- 39 Predict
- 16 Jupyter Workspaces
- 23 R & Python Tiles
- 401 Distribute
- 116 Domo Everywhere
- 277 Scheduled Reports
- 8 Software Integrations
- 132 Manage
- 129 Governance & Security
- 8 Domo Community Gallery
- 38 Product Releases
- 12 Domo University
- 5.4K Community Forums
- 40 Getting Started
- 30 Community Member Introductions
- 111 Community Announcements
- 4.8K Archive