Beast Mode

Options

Hey Guys,

Coming today with a beast mode question. See attached screenshot for reference. I have a dataset that looks like the attached and want to calculate the value of those ID's. Here is a catch, I want to count only EITHER unique ID's (those that do not have any Revisions), or the latest revision for a given quote. You can see that ID 107 is a third revision for quote 100. In that case I would only want to count ID 107, as that's the latest revision of that quote (ID's 100,103,106 will be dropped). Is that possible to calculate? Marked in green is what I want to have included.

Thanks

Best Answers

  • nmizzell
    nmizzell Contributor
    Answer ✓
    Options

    You need a few helper columns to acheive this. The first is the ID, the second is the revision number. assuming the revision number is in chronological order (highest number being the one you want to count) then you can add another column that finds the maximum revision number by ID. Lastly, create a function that reads the maximum revision number and the actual revision number, and produces a 0 if it is not the maximum, and a 1 if it is the maximum. Filter off anything that is not equal to the maximum id revision, then count distinct ID values.

    This can be done in magic ETL and in Beastmode.

    beast mode for finding the maximum revision number by id: max(`revision_number`) over(`id`)

    beast mode for determining if the current revision number is equal to the max: case when `revision_number` = max(`revision_number`) over(`id`) then 1 else 0

    filter off anything that is a 0 in your card

    beastmode to count distinct IDs: count(distinct `id`)

    This is just one of many ways to solve the problem.

  • marcel_luthi
    marcel_luthi Coach
    edited December 2023 Answer ✓
    Options

    Agree with @nmizzell, there are several ways to do this. The first thing I'd do is in ETL break the revision number into 2 separate columns, so I know which is the original vs the revisions. A formula tile with 2 formulas (for OriginalID and RevisionNum) would work.

    OriginalID

    CASE WHEN `Revision` IS NULL THEN `ID` ELSE SPLIT_PART(`Revision`,'-',1) END
    

    RevisionNum

    CASE WHEN `Revision` IS NULL THEN 0 ELSE SPLIT_PART(`Revision`,'-',2) END
    

    You might need an ALTER COLUMN tile after the formula one, to make sure the revision number is treated as number and not text. This has the advantage that the columns will also be populated as revision 0 for the original entry, so you don't have nulls in your data and can save you lots of headaches later when trying to graph. Once you have this helper columns, I'll go with the suggested beast modes.

Answers

  • nmizzell
    nmizzell Contributor
    Answer ✓
    Options

    You need a few helper columns to acheive this. The first is the ID, the second is the revision number. assuming the revision number is in chronological order (highest number being the one you want to count) then you can add another column that finds the maximum revision number by ID. Lastly, create a function that reads the maximum revision number and the actual revision number, and produces a 0 if it is not the maximum, and a 1 if it is the maximum. Filter off anything that is not equal to the maximum id revision, then count distinct ID values.

    This can be done in magic ETL and in Beastmode.

    beast mode for finding the maximum revision number by id: max(`revision_number`) over(`id`)

    beast mode for determining if the current revision number is equal to the max: case when `revision_number` = max(`revision_number`) over(`id`) then 1 else 0

    filter off anything that is a 0 in your card

    beastmode to count distinct IDs: count(distinct `id`)

    This is just one of many ways to solve the problem.

  • marcel_luthi
    marcel_luthi Coach
    edited December 2023 Answer ✓
    Options

    Agree with @nmizzell, there are several ways to do this. The first thing I'd do is in ETL break the revision number into 2 separate columns, so I know which is the original vs the revisions. A formula tile with 2 formulas (for OriginalID and RevisionNum) would work.

    OriginalID

    CASE WHEN `Revision` IS NULL THEN `ID` ELSE SPLIT_PART(`Revision`,'-',1) END
    

    RevisionNum

    CASE WHEN `Revision` IS NULL THEN 0 ELSE SPLIT_PART(`Revision`,'-',2) END
    

    You might need an ALTER COLUMN tile after the formula one, to make sure the revision number is treated as number and not text. This has the advantage that the columns will also be populated as revision 0 for the original entry, so you don't have nulls in your data and can save you lots of headaches later when trying to graph. Once you have this helper columns, I'll go with the suggested beast modes.