Forward Filling using ETL / Without Python tile

Prathmesh24_Diacto
edited May 17 in Magic ETL

Hello, I am trying to work out the logic to forward-fill values.

Below is an example of what I want to achieve:


I want to forward-fill the values in the Calculation column, grouped by platform and code.

So, I want to populate the last non-null value instead of zero in the Calculation column. The output then becomes:

The reason for doing this is so that when I take the absolute average of the Calculation column for, say, platform A and code ABC, I get the correct answer. Currently, as in image 1, it sums the Calculation values and divides by the number of rows, regardless of whether a value is zero. I am aware that we can skip zero values while averaging, but I don't want to remove those records; instead, I want to fill them with the last non-null value.
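To make the goal concrete outside of Magic ETL, here is a minimal Python sketch of the intended result. The sample rows, the date column, and the function names `forward_fill` and `group_mean` are all hypothetical and only illustrate the logic; the actual dataset is in the images.

```python
# Hypothetical sample rows: (platform, code, date, calculation).
# A calculation of 0 stands for "no value on that date".
rows = [
    ("A", "ABC", "2023-01-01", 10.0),
    ("A", "ABC", "2023-01-02", 0.0),
    ("A", "ABC", "2023-01-03", 0.0),
    ("A", "ABC", "2023-01-04", 20.0),
    ("B", "XYZ", "2023-01-01", 0.0),
    ("B", "XYZ", "2023-01-02", 5.0),
]


def forward_fill(rows):
    """Replace each 0 with the last non-zero value in the same (platform, code) group."""
    last_seen = {}
    filled = []
    # Sort by group and then by date so "last" means "most recent earlier date".
    for platform, code, date, calc in sorted(rows, key=lambda r: (r[0], r[1], r[2])):
        key = (platform, code)
        if calc != 0:
            last_seen[key] = calc
        # A leading 0 with no earlier value in its group stays 0.
        filled.append((platform, code, date, last_seen.get(key, calc)))
    return filled


def group_mean(rows, platform, code):
    """Plain average of the calculation values for one (platform, code) group."""
    values = [r[3] for r in rows if r[0] == platform and r[1] == code]
    return sum(values) / len(values)


filled = forward_fill(rows)
print(group_mean(rows, "A", "ABC"))    # 7.5  (zeros drag the average down)
print(group_mean(filled, "A", "ABC"))  # 12.5 (zeros replaced by the last value)
```

The row count per group is unchanged; only the zero values are replaced, which is why the two averages differ.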

Thanks much in advance!

Best Answers

  • Sean_Tully
    Sean_Tully Contributor
    Answer ✓

    I think this would work conceptually:

    In ETL, filter out all the zeros so you only have the rows with calculation values. Then use a window function (lead) to get the next date with a value, so that each row has the original date, the calc value, and the next date with a value. Join that back to the original input, with the join condition being something like date > calc_value_date and date < next_value_date. You can then use a formula tile to replace the 0s with the value you just brought in on the join.
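    The steps above can be sketched in plain Python to check the logic before building the tiles. This is only an illustration under assumptions: the sample rows and the name `fill_via_range_join` are hypothetical, the grouping by platform and code is taken from the question, and treating the last value in a group as open-ended (no next date) is an added assumption, since the post does not say how to handle it.

```python
from collections import defaultdict

# Hypothetical sample rows: (platform, code, date, calculation); 0 means "no value".
# Dates are ISO strings, so comparing them as text matches chronological order.
rows = [
    ("A", "ABC", "2023-01-01", 10.0),
    ("A", "ABC", "2023-01-02", 0.0),
    ("A", "ABC", "2023-01-03", 0.0),
    ("A", "ABC", "2023-01-04", 20.0),
    ("A", "ABC", "2023-01-05", 0.0),
    ("B", "XYZ", "2023-01-01", 0.0),
    ("B", "XYZ", "2023-01-02", 5.0),
]


def fill_via_range_join(rows):
    # Step 1: filter out the zeros, keeping only rows that carry a value.
    valued = defaultdict(list)
    for platform, code, date, calc in rows:
        if calc != 0:
            valued[(platform, code)].append((date, calc))

    # Step 2: "lead" within each group, pairing every value row with the
    # date of the next value row (None when there is no later value).
    ranges = defaultdict(list)
    for key, items in valued.items():
        items.sort()
        for i, (date, calc) in enumerate(items):
            next_date = items[i + 1][0] if i + 1 < len(items) else None
            ranges[key].append((date, next_date, calc))

    # Steps 3 and 4: range-join back to the original rows and replace the 0s.
    out = []
    for platform, code, date, calc in rows:
        fill = calc
        if calc == 0:
            for start, end, value in ranges[(platform, code)]:
                # Assumption: an open-ended range (end is None) covers all later dates.
                if date > start and (end is None or date < end):
                    fill = value
                    break
        out.append((platform, code, date, fill))
    return out


for row in fill_via_range_join(rows):
    print(row)
```

    Rows that already have a value never match the strict date > calc_value_date condition, so in ETL the join back to the original input would need to keep unmatched rows (for example a left join) so those records are not dropped.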

  • GrantSmith
    GrantSmith Coach
    Answer ✓

    I've outlined an alternative method here:

    You'd need to use 0 instead of NULL when calculating your column field.


Answers

  • trafalger

    You'll need an ETL, since you need to generate additional rows.

  • Prathmesh24_Diacto
    edited May 17

    Hello @trafalger, thank you for replying. I don't see how we can fill the zeros with the last non-null values based on the platform and code grouping. I am not sure how it would keep looping.
