Historic Timesheet Analysis

mhouston · 2022-04-21T19:01:25+00:00

There was an error rendering this rich post.

I'm working on an analysis of our historic timesheet data and trying to calculate our employee utilization (actual hours per day/8), so that we have a historic benchmark and can compare to projections for future utilization.

The challenge I'm running into is actually cleaning the data. It's generally well-structured - in theory, there should be a single row per person per project per day (some individuals may charge time to multiple projects). However, there are instances where adjustments are made - ex: someone charged 1 week of time to the wrong project (project A). So then accounting adds an entry for negative hours to project A, and then positive hours to Project B in the next pay period. Example:

However, the adjustments are not always consistent in how and when they are done (and there's no code to distinguish between an adjustment entry and a regular entry).

The projected utilization is calculated on a per day basis, but trying to do that with our historic data means there are some weird spikes because of the adjustments (one case was someone charged 38 hours on a day one week, and then there is an entry for -38 on a day in the next week).

Does anyone have any ideas on how to approach this? I know that I can manually clean this but I'm trying to avoid that so that when the timesheet data is processed weekly, there isn't manual maintenance to keep a card accurate.

Find more posts tagged with

Dataset

Magic ETL

Accepted answers

MarkSnodgrass

This is a tough one as I don't know that there is a clear cut solution since this is around data cleaning. Here are a few ideas for you, and maybe one of these will work for you:

Use recursive dataflows to identify adjustments - If you use a recursive dataflow, you will be able to use the batch_date as an additional timestamp field to help you identify new entries if you looked at entries by user, date and project. This won't help you on data that is currently already in there, but could help you going forward.
Look for negative entries or entries greater than 8 hours - You could use a filter to pull these rows and then do some additional work to determine how many days you need to spread this out to so that you are creating adjusting entries for each day rather than a bulk entry. Once you have them smoothed out, you can append it back to the main data.
Train accounting - This probably goes without saying, but let accounting know the issue that they are causing by doing a single bulk entry and ask them to make the correct entry for each day and date it accordingly.

Hope this helps.

All comments

MarkSnodgrass

This is a tough one as I don't know that there is a clear cut solution since this is around data cleaning. Here are a few ideas for you, and maybe one of these will work for you:

Use recursive dataflows to identify adjustments - If you use a recursive dataflow, you will be able to use the batch_date as an additional timestamp field to help you identify new entries if you looked at entries by user, date and project. This won't help you on data that is currently already in there, but could help you going forward.
Look for negative entries or entries greater than 8 hours - You could use a filter to pull these rows and then do some additional work to determine how many days you need to spread this out to so that you are creating adjusting entries for each day rather than a bulk entry. Once you have them smoothed out, you can append it back to the main data.
Train accounting - This probably goes without saying, but let accounting know the issue that they are causing by doing a single bulk entry and ask them to make the correct entry for each day and date it accordingly.

Hope this helps.

mhouston

@MarkSnodgrass thanks for the ideas! Your point about training accounting is a good one and this is probably a good opportunity to revisit the time entry business process and see if we can put better controls in place.

jaeW_at_Onyx

@mhouston you probably can't automate allocation.

you could try building buisness logic.

so like if the amount posted is negative then just subtract it from previous entries (so we operate on the assumption that we aren't back posting two or three week ago adjustments).

this might help you smooth out the negative spikes ... but also assumes that in your -40 hours example that we need to adjust out the 22nd, 21st etc.

alternatively you could use Domo as a business process for capturing the why adjustments were made if your ERP system doesn't support capturing 'reason codes'

so you could use a webform (or form builder or any other data entry tool) that requires users to justify why a negatvie entry was made (alerts). ie you'd need an "applies-to" field to say -40 hours applies to entry 36, 37, 39

THEN run an ETL that produces an 'adjusted dataset' where you GROUP BY entry_id to apply the allocation.

mhouston

@jaeW_at_Onyx how would you retain the information of which project the hours go to? like if hours are moved from A->B or from a project to overhead (Which impacts the utilization calc?) If I do a group by day to sum the hours per day, I lose that info, and then the utilization calc is wrong.

We also definitely have adjustments going back more than a week... we had an example where someone's hours for january were all wrong, so then there's a lump sum adjustment in february to move all of those.

Our ERP has a place for comments in timesheet so that could potentially be where we capture it, but we definitely don't have the business process established for that yet.