Magic ETL: How to REMOVE Rows Meeting a Two-Column Criteria?

As part of my ETL, I'm trying to REMOVE rows that meet the following criteria:

column 1 is NOT NULL

AND

column 2 is NULL

So if column 1 has any value in it AND column 2 is empty, then remove those rows from the data set.

I was trying to figure out how to create a Filter Formula for this, but it seems to be designed for INCLUDING rows that meet criteria rather than REMOVING them.

Tagged:

Best Answer

  • GrantSmith
    GrantSmith Coach
    Answer ✓

    Think of a filter tile as trying to selected the rows that match a certain criteria. If you're using a filter formula you can invert the logic. So if you know the logic you want to remove add a NOT in front of everything encapsulated in parenthesis.

    NOT (`column 1` IS NOT NULL AND `column 2` IS NULL)
    

    If you take the inverse of that logic you'd take out your NOT NULL and replace with NULL and replace NULL with NOT NULL and also convert AND conditions to OR condition and OR conditions to AND conditions.

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**

Answers

  • RobSomers
    RobSomers Coach
    edited February 2023

    @pstrauss You would just do the opposite of what you're wanting to remove:

    column 1 is null and column 2 is not null

    **Was this post helpful? Click Agree or Like below**

    **Did this solve your problem? Accept it as a solution!**

  • GrantSmith
    GrantSmith Coach
    Answer ✓

    Think of a filter tile as trying to selected the rows that match a certain criteria. If you're using a filter formula you can invert the logic. So if you know the logic you want to remove add a NOT in front of everything encapsulated in parenthesis.

    NOT (`column 1` IS NOT NULL AND `column 2` IS NULL)
    

    If you take the inverse of that logic you'd take out your NOT NULL and replace with NULL and replace NULL with NOT NULL and also convert AND conditions to OR condition and OR conditions to AND conditions.

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**
  • Thanks @GrantSmith and @RobSomers that's very helpful!

  • pstrauss
    pstrauss Member
    edited February 2023

    On a related note, what's the most reliable way to filter out rows with specific values in them but not null rows?

    For example, I want to remove rows where column_1="A" OR column_1="B" but not where column 1 is NULL?

    I originally tried two filters with an ALL setting where column_1<>"A" or column_1<>"B" but it seems to remove NULLs too when I do that.

    I think it's the inverted logic that is throwing me off.

  • @pstrauss Nulls don't play well logical operators. Null doesn't mean there's no value, it means the value is unknown. Is the null in column_1 equal to A? SQL doesn't know. It might or it might not, so SQL treats at as meeting the condition. That's a simplified answer, but this gives a more in depth explanation:

    So you will either need to include IS NOT NULL or IS NULL in your filter rules or formula to account for NULLS or use IFNULL() in your filter to assign a value to the nulls that can be used in the comparison.

    **Was this post helpful? Click Agree or Like below**

    **Did this solve your problem? Accept it as a solution!**