Aggregating Data for a Beastmode - Counting Survival Analysis
Hi there -- I have a dataset that has an individual record per employee that states if they are active or terminated and that gives their current tenure or the tenure at the time that they left the company.
I'm trying to do a survival beastmode using a bar chart where I show 100% at tenure = 0, and then at tenure = 1, I want to show the proportion that has left the company, into tenure = 2, etc.
I have the calculated tenures and have it rounded down and labeled, but am having trouble accomplishing the above. Is there any way to do a beastmode where I show the survival / retention likelihood to making it to 1-year of tenure, 2-years, etc. where the horizontal access is years (through 5 years) and the y-axis shows survival based on the total population and dropping off those that don't make it to that next threshold.
This is my beastmode so far for survival:
The yellow highlight is my pre-existing beastmode that reflects current tenure (if active, time from hire to load, if terminated, time from hire to term). At time 0, I want the survival to be 1 (or 100%), then I want to subtract from 1, the proportion of departures that left with < 1 year of tenure. However, this does not appear to be working especially for trying to iterate this to 2, 3, 4, + years.
Code also copied below (instead of an image):
CASE WHEN floor((CASE WHEN Termination Date
is null then DateDiff(ETL_DATE
,Last Hire Date
) else DateDiff(Termination Date
,Last Hire Date
) end) / 365.25) = 0
THEN
1
WHEN floor((CASE WHEN Termination Date
is null then DateDiff(ETL_DATE
,Last Hire Date
) else DateDiff(Termination Date
,Last Hire Date
) end) / 365.25) = 1
THEN
1 - (SUM (CASE WHEN ((CASE WHEN Employee Status
!='TERMINATED' then DateDiff(ETL_DATE
,Last Hire Date
) else DateDiff(Termination Date
,Last Hire Date
) end) / 365.25) < 1 AND Employee Status
= 'TERMINATED' THEN 1 ELSE 0 END) / SUM (CASE WHEN ((CASE WHEN Employee Status
!='TERMINATED' then DateDiff(ETL_DATE
,Last Hire Date
) else DateDiff(Termination Date
,Last Hire Date
) end) / 365.25) >= 1 THEN 1 ELSE 0 END))
Any advice on how to make this happen while still remaining interactive (removing certain orgs, focusing on certain countries, etc. via filters)? Much appreciated.
Best Answer
-
In this case, there are 2 possible paths you can take:
If drill down capabilities are needed, then you'll need to transform your data using ETL and create a beast mode that have copies of each persons entries for each tenure year in which they should be counted for. This required a bit more set up but reduces the number of BeastModes you'll need to achieved the desired result.
If drilling is not needed, then you can create several BeastModes, one per year and metric you want to achieve, as you'd need the same row of data in your dataset to be counted towards multiple aggregations. As an example, for Tenure X you'd have:
Active (Basically will only return the ID if that employee had been active for at least X years)
CASE WHEN `Last Hire Date` + INTERVAL X YEAR <= COALESCE(`Termination Date`, CURDATE()) THEN `ID` END
Terminated in period (those that made it to the tenure, but hae been terminated before reaching the next)
CASE WHEN `Last Hire Date` + INTERVAL X YEAR <= COALESCE(`Termination Date`, CURDATE()) AND `Last Hire Date` + INTERVAL (X + 1) YEAR >= `Termination Date` THEN `ID` END
Then your Survival % for X can be calculated as:
1-COUNT(CASE WHEN `Last Hire Date` + INTERVAL X YEAR <= COALESCE(`Termination Date`, CURDATE()) AND `Last Hire Date` + INTERVAL (X + 1) YEAR >= `Termination Date` THEN `ID` END) / COUNT(CASE WHEN `Last Hire Date` + INTERVAL X YEAR <= COALESCE(`Termination Date`, CURDATE()) THEN `ID` END)
If you only care about the % you can just create 5 of the last formulas and call it a day. Using the following as an example:
Then the calculation above will look as follows on a table (did up to term 3, to demonstrate)
1
Answers
-
Can you share an image (even if is a mock table in a spreadsheet) of what your ideal outcome would look like? Most likely, based on your description you'll need to use 1 formula per tenure year.
If I'm following properly, lets say there have been 100 employees in the company all time and the current count by tenure are: Year 0 15, Year 1 15, Year 2 30, Year 3 30, Year 4 10. Would you say your percentages need to be: Year 0 100%, Year 1 85%, Year 2 70%, Year 3 40%, Year 4 10%? Or do you need to account that the tenure for each year should only be based on the people that could have been in the company for that long, so if someone started in Jan 2023, you only count him for Tenure 0 and 1, since it hasn't been 2 years yet so there is no way he can contribute to later tenures?
0 -
Thank you!
Using that simplified example, in Year 0 it would be 100% (all 100 employees have at least 0 years of tenure and all are active).
In Year 1, the denominator would shrink to 85 (as the 15 with 0 years of tenure have not yet made it there), and the numerator would be the number of employees that terminated with < 1 year of tenure. The value I would want to show would be 1-x (x being the new value above).
In Year 2, the denominator would shrink to 70 (as an additional 15 have not made it to 2 years yet), and the numerator would be the number of employees that terminated with between 1 and 2 years of tenure. The value would then be the value from year 1 minus this new value, and so on.
Here it is in Excel (visual & then the calculations). I want to do this as a DOMO bar chart, but I need to code the beastmode(s) for Years 1, 2, 3, 4 and 5. I don't know if I can do that as one statement, or 5 separate statements, and I'm also not sure how to get them to show on the graph where the x-axis is static showing those 5 years (with 1-yr increments):
Where Tenure = completed years with the company
No Departed = the count of those terminated with tenure > the previous tenure bin and < = the current tenure bin.
Active = those in the company (active or terminated) that successfully made it to that next tenure threshold.
Non-Survival = No Departed / Active
Survival = 1 - Non-Survival
Survival P = Previous Survival P * Survival
0 -
In this case, there are 2 possible paths you can take:
If drill down capabilities are needed, then you'll need to transform your data using ETL and create a beast mode that have copies of each persons entries for each tenure year in which they should be counted for. This required a bit more set up but reduces the number of BeastModes you'll need to achieved the desired result.
If drilling is not needed, then you can create several BeastModes, one per year and metric you want to achieve, as you'd need the same row of data in your dataset to be counted towards multiple aggregations. As an example, for Tenure X you'd have:
Active (Basically will only return the ID if that employee had been active for at least X years)
CASE WHEN `Last Hire Date` + INTERVAL X YEAR <= COALESCE(`Termination Date`, CURDATE()) THEN `ID` END
Terminated in period (those that made it to the tenure, but hae been terminated before reaching the next)
CASE WHEN `Last Hire Date` + INTERVAL X YEAR <= COALESCE(`Termination Date`, CURDATE()) AND `Last Hire Date` + INTERVAL (X + 1) YEAR >= `Termination Date` THEN `ID` END
Then your Survival % for X can be calculated as:
1-COUNT(CASE WHEN `Last Hire Date` + INTERVAL X YEAR <= COALESCE(`Termination Date`, CURDATE()) AND `Last Hire Date` + INTERVAL (X + 1) YEAR >= `Termination Date` THEN `ID` END) / COUNT(CASE WHEN `Last Hire Date` + INTERVAL X YEAR <= COALESCE(`Termination Date`, CURDATE()) THEN `ID` END)
If you only care about the % you can just create 5 of the last formulas and call it a day. Using the following as an example:
Then the calculation above will look as follows on a table (did up to term 3, to demonstrate)
1 -
Thank you greatly, I think I have what I need now. This is perfect! I see you did the final visual in a table, can those individual beast modes (if I go that direction) be made into a visual chart (rather than just a table) that shows the survival % beast mode?
0 -
Is doable, but it'll be tricky and depending on the card type used it might not be possible to get exactly what you need. The closer I could think of that would yield a similar result is the grouped lolipop chart or grouped bar chart, using a constant as your X axis. It'd look something like one of the following:
0
Categories
- All Categories
- 1.9K Product Ideas
- 1.9K Ideas Exchange
- 1.6K Connect
- 1.3K Connectors
- 302 Workbench
- 6 Cloud Amplifier
- 9 Federated
- 2.9K Transform
- 104 SQL DataFlows
- 637 Datasets
- 2.2K Magic ETL
- 3.9K Visualize
- 2.5K Charting
- 761 Beast Mode
- 65 App Studio
- 42 Variables
- 704 Automate
- 182 Apps
- 458 APIs & Domo Developer
- 53 Workflows
- 11 DomoAI
- 39 Predict
- 16 Jupyter Workspaces
- 23 R & Python Tiles
- 401 Distribute
- 116 Domo Everywhere
- 277 Scheduled Reports
- 8 Software Integrations
- 132 Manage
- 129 Governance & Security
- 8 Domo Community Gallery
- 38 Product Releases
- 12 Domo University
- 5.4K Community Forums
- 40 Getting Started
- 30 Community Member Introductions
- 111 Community Announcements
- 4.8K Archive