LEAD IGNORE NULL in Magic ETL?

Options
Robaba04
Robaba04 Member
edited September 2023 in Magic ETL

Hi there,


I am trying to rewrite a Redshift Dataflow into Magic ETL. I ran across a specific function that was available in Redshift that I dont think is possible in Magic ETL. I wanted to get the community's input and advice to see what you guys think.
This is the function found in Redshift:


LEAD(CASE WHEN t2."channel" NOT IN ('Operational Email','Direct') THEN COALESCE(t2."source",'blank') END, 1) IGNORE NULLS OVER (PARTITION BY t1."domo_id" ORDER BY t2."sessionid" DESC)


In Magic ETL, there is the Rank & Window tile that allows for the LEAD function, but it does not behave the same as the Redshift since the Redshift has the IGNORE NULL clause.

There are 20 cases of this LEAD IGNORE NULL function in the Redshift, so my first thought is that the method would be 20 filters + 20 Rank & Windows in Magic. I'm afraid that will affect performance for larger datasets.

Should I leave the Redshift Dataflow as Redshift?
What are some ways that you guys think I should handle this?

The current Redshift Dataflow runs in 4 hours. The purpose for moving into Magic, is the hope that it would reduce the run time significantly.

Appreciate any help!

Thank you in advance

Best Answer

  • GrantSmith
    GrantSmith Coach
    Answer ✓
    Options

    There isn't a simple way to do it within Magic ETL but you could attempt to calculate a specific value change id.

    To do this you could use a formula tile to conditionally set a column value of 1 if the field is not blank or 0 if it is blank / null.

    Then using a rank and window tile do a running SUM of your column so that it will create an ID to increase when a non-null value is found. Make sure you're partitioning on the same field as you stated above.

    Then use a group by tile to group by your partition field and ID field within each partition to select the MAX value.

    Join this back to your original dataset based on your partition key and the ID field to show what the value is.

    You can then use this new fields as your value.

    Essentially we've done a forward-fill of the values to replace the nulls.

    Then you can feed it into a Rank and Window tile to do the LEAD on the values to get the next value.

    If you want to do a back-fill to take the next non-null value and replace all the prior nulls then you'd need to join it based on the ID being the ID-1 and then COALESCE your current value with the new value you calculated.

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**

Answers

  • GrantSmith
    GrantSmith Coach
    Answer ✓
    Options

    There isn't a simple way to do it within Magic ETL but you could attempt to calculate a specific value change id.

    To do this you could use a formula tile to conditionally set a column value of 1 if the field is not blank or 0 if it is blank / null.

    Then using a rank and window tile do a running SUM of your column so that it will create an ID to increase when a non-null value is found. Make sure you're partitioning on the same field as you stated above.

    Then use a group by tile to group by your partition field and ID field within each partition to select the MAX value.

    Join this back to your original dataset based on your partition key and the ID field to show what the value is.

    You can then use this new fields as your value.

    Essentially we've done a forward-fill of the values to replace the nulls.

    Then you can feed it into a Rank and Window tile to do the LEAD on the values to get the next value.

    If you want to do a back-fill to take the next non-null value and replace all the prior nulls then you'd need to join it based on the ID being the ID-1 and then COALESCE your current value with the new value you calculated.

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**