Beast Mode

Question

Hey Guys,

Coming today with a beast mode question. See attached screenshot for reference. I have a dataset that looks like the attached and want to calculate the value of those ID's. Here is a catch, I want to count only EITHER unique ID's (those that do not have any Revisions), or the latest revision for a given quote. You can see that ID 107 is a third revision for quote 100. In that case I would only want to count ID 107, as that's the latest revision of that quote (ID's 100,103,106 will be dropped). Is that possible to calculate? Marked in green is what I want to have included.

Thanks

example.PNG

nmizzell · Accepted Answer

You need a few helper columns to acheive this. The first is the ID, the second is the revision number. assuming the revision number is in chronological order (highest number being the one you want to count) then you can add another column that finds the maximum revision number by ID. Lastly, create a function that reads the maximum revision number and the actual revision number, and produces a 0 if it is not the maximum, and a 1 if it is the maximum. Filter off anything that is not equal to the maximum id revision, then count distinct ID values.

This can be done in magic ETL and in Beastmode.

beast mode for finding the maximum revision number by id: max(`revision_number`) over(`id`)

beast mode for determining if the current revision number is equal to the max: case when `revision_number` = max(`revision_number`) over(`id`) then 1 else 0

filter off anything that is a 0 in your card

beastmode to count distinct IDs: count(distinct `id`)

This is just one of many ways to solve the problem.

marcel_luthi · Accepted Answer

Agree with @nmizzell, there are several ways to do this. The first thing I'd do is in ETL break the revision number into 2 separate columns, so I know which is the original vs the revisions. A formula tile with 2 formulas (for OriginalID and RevisionNum) would work.

OriginalID

CASE WHEN `Revision` IS NULL THEN `ID` ELSE SPLIT_PART(`Revision`,'-',1) END

RevisionNum

CASE WHEN `Revision` IS NULL THEN 0 ELSE SPLIT_PART(`Revision`,'-',2) END

You might need an ALTER COLUMN tile after the formula one, to make sure the revision number is treated as number and not text. This has the advantage that the columns will also be populated as revision 0 for the original entry, so you don't have nulls in your data and can save you lots of headaches later when trying to graph. Once you have this helper columns, I'll go with the suggested beast modes.