Beast Model for Calculating Average of Values for lines which meets certain conditions
Hi Everyone,
I have looked through several posts to see if this was already discussed, but couldn't didn't quite find what I was looking for.
I have a dataset that contains a date column representing the week for which the data is for. I would like to calculate the average of values within a 6 week period from the original date that meets certain conditions. I would like to calculate this for every row in my dataset.
Example:
For data in row 1: I would like to get the average of values from rows 2 - 7 (not because it is the sequential row number but because those are the weeks within a 6 week period), excluding row 4 and 7 (because Event field= Y).
For data in row 2: I would like to get the average of values from rows 3 - 8, excluding row 4 and 7 in the calculation
...
For data in row 8: I would get the average of values from rows 9 - 14, no exclusions since all these records for Event field = N
Number | Location | Product | Sales | Week | Event? | Average | |
1 | 123 | ABC | 100 | 14-Oct-2018 | N | 111.25 | =AVERAGE(D4,D5,D7,D8) |
2 | 123 | ABC | 60 | 21-Oct-2018 | N | 116.25 | =AVERAGE(D5,D7,D8,D10) |
3 | 123 | ABC | 200 | 28-Oct-2018 | N | 88.75 | |
4 | 123 | ABC | 560 | 4-Nov-2018 | Y | 91 | =AVERAGE(D7,D8,D10,D11,D12) |
5 | 123 | ABC | 110 | 11-Nov-2018 | N | ||
6 | 123 | ABC | 75 | 18-Nov-2018 | N | ||
7 | 123 | ABC | 600 | 25-Nov-2018 | Y | ||
8 | 123 | ABC | 80 | 2-Dec-2018 | N | 96.16667 | =AVERAGE(D11,D12,D13,D14,D15,D16) |
9 | 123 | ABC | 90 | 9-Dec-2018 | N | ||
10 | 123 | ABC | 100 | 16-Dec-2018 | N | ||
11 | 123 | ABC | 110 | 23-Dec-2018 | N | ||
12 | 123 | ABC | 85 | 30-Dec-2018 | N | ||
13 | 123 | ABC | 88 | 6-Jan-2019 | N | ||
14 | 123 | ABC | 104 | 13-Jan-2019 | N | ||
15 | 123 | ABC | 90 | 20-Jan-2019 | N |
Can someone provide some guidance on how I might be able to approach this in beast mode? Thanks!
Comments
-
This kind of calculation would need to be handled from inside a dataflow. Are you always going to want to look at the trialing 6 week average (where event='N')?
What do you do when you don't have 6 weeks of trailing data? for example, if the data is from three weeks ago, I would only have 2 weeks of trailing data, do I take the average of those two weeks? Or would I leave it blank?
“There is a superhero in all of us, we just need the courage to put on the cape.” -Superman0 -
What is your recommendation for handling it within a dataflow? Can I manage it via MagicETL?
Are you always going to want to look at the trialing 6 week average (where event='N')? Yes, we will always look at the next 6 weeks data, but only take those rows where the event = 'N'. So if I end up with 2 weeks - I will only average the value of those 2 weeks.
For those records without 6 weeks of trailing data, I will keep them as blank as we don't have sufficient data to do the calculation.
0 -
Last question before I get working on this. Are you looking to aggregate this per location? Per product (company wide)? or per product and location? or are you looking for company wide metrics (across all products and locations)?
Because the calculation is being done in the dataflow, it forces the agregation to be static.
“There is a superhero in all of us, we just need the courage to put on the cape.” -Superman0 -
Aggregated at a location...
0 -
First of all... yes, you can do this in an ETL. However, I am not the best at creating some of these more complex dataflows via ETL. So I'm sure that there are probably more elegant ways of performing this within an ETL, but this is the best I could do. I decided to keep the product detail in the output dataset because I imagined that would be one of the follow up questions, "What products are driving these trends?"
I will post my steps here, let me know if you get lost anywhere along the way (apologies in advance because, as I mentioned, this got a little sloppy on me)
Overview
Data Prep
Data Joins
Calculations
Here is the RegEx that I used:
^(\s|\S)*(\S)+(\s|\S)*$
Output
You end up with a data set like this:
I used a beastmode for the average sales:
sum(`Sales over next 6 with no event`) / max(`Number of weeks with no Event`)
Hope that helps. This was a fun exercise in forcing myself to use ETL
“There is a superhero in all of us, we just need the courage to put on the cape.” -Superman0
Categories
- All Categories
- 1.7K Product Ideas
- 1.7K Ideas Exchange
- 1.5K Connect
- 1.2K Connectors
- 292 Workbench
- 4 Cloud Amplifier
- 8 Federated
- 2.8K Transform
- 95 SQL DataFlows
- 602 Datasets
- 2.1K Magic ETL
- 3.7K Visualize
- 2.4K Charting
- 695 Beast Mode
- 43 App Studio
- 39 Variables
- 658 Automate
- 170 Apps
- 441 APIs & Domo Developer
- 42 Workflows
- 5 DomoAI
- 32 Predict
- 12 Jupyter Workspaces
- 20 R & Python Tiles
- 386 Distribute
- 111 Domo Everywhere
- 269 Scheduled Reports
- 6 Software Integrations
- 113 Manage
- 110 Governance & Security
- 8 Domo University
- 30 Product Releases
- Community Forums
- 39 Getting Started
- 29 Community Member Introductions
- 98 Community Announcements
- Domo Community Gallery
- 4.8K Archive