Marking Duplicates So I Can Filter Them Out

Good afternoon,

I am trying to mark rows that match up with each other as duplicates so that I can filter them out and only have the truly unique records left.

Example

image.png

So after I mark and filter the only record left will be the third one that doesn't have a match.

What is the best way to do this?

Thank you in advance

Best Answer

  • ColemenWilson
    Answer ✓

    Ah sorry I missed that you want both the original and the duplicate row gone when a duplicate exists. Interested in your use case - why you want both records gone when there is a duplicate and not just the duplicate? Is it that the duplicate transaction is to zero out the previous? If that is the case, and `retail` is the field that is being zero'd out then I agree with @ST_-Superman-_ 's approach. Here is how I did it using the data you provided with the correct output you are looking for:

    1. Overall ETL:

    Screenshot 2024-02-01 at 4.38.53 PM.png

    2. Group By step:

    Screenshot 2024-02-01 at 4.39.10 PM.png

    3. Filter Rows step

    Screenshot 2024-02-01 at 4.40.12 PM.png

    4. Output - Only unique rows

    Screenshot 2024-02-01 at 4.40.33 PM.png

    If I solved your problem, please select "yes" above

Answers