Flag for duplicate record

I have two datasets with the same structure and columns: one large dataset with all records, and another large dataset with the suspected duplicate customers removed. I want to join them and create a flag for the suspected duplicate records.

I tried adding a formula tile to each dataset: Duplicate = Yes (for the dataset with the dupes filtered out) and Duplicate = No for the other. I then tried appending the records and just got null values.

Any suggestions here?

Answers

  • @renee12345 Can you please share some screenshots of your dataflow showing where you are seeing the null values?

  • How did you do the appending?

    Did you do a left join from your all-records dataset to your duplicates-removed dataset, based on the primary identifiers shared by the two tables? If an identifier isn't found in the duplicates-removed dataset, you can use a formula tile to flag that record as a duplicate:

    CASE WHEN `id field from removed dataset` IS NULL THEN 'Duplicate' ELSE 'Not Duplicate' END
    

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**
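
The left-join suggestion above can be tried outside Domo as a rough sketch. The example below uses Python's built-in `sqlite3` module purely as a stand-in for the dataflow; the table names `all_records` and `deduped_records`, the `customer_id` key, and the sample rows are all hypothetical and only there to illustrate the idea. It also assumes `customer_id` is unique in the duplicates-removed table.

```python
import sqlite3

# In-memory database standing in for the two datasets (hypothetical names and data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE all_records (customer_id INTEGER, name TEXT);
    CREATE TABLE deduped_records (customer_id INTEGER, name TEXT);
    INSERT INTO all_records VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Bob B.'), (4, 'Cara');
    -- customer 3 was removed as a suspected duplicate
    INSERT INTO deduped_records VALUES (1, 'Ann'), (2, 'Bob'), (4, 'Cara');
""")

# Left join from all records to the duplicates-removed set.
# A NULL on the right-hand side means the record was removed, so flag it.
rows = conn.execute("""
    SELECT a.customer_id,
           a.name,
           CASE WHEN d.customer_id IS NULL
                THEN 'Duplicate' ELSE 'Not Duplicate' END AS duplicate_flag
    FROM all_records AS a
    LEFT JOIN deduped_records AS d
           ON a.customer_id = d.customer_id
    ORDER BY a.customer_id
""").fetchall()

for row in rows:
    print(row)
```

With this sample data, only customer 3 comes back flagged as `Duplicate`. If the join key were not unique in the duplicates-removed table, the left join would return more than one row per record, so it is worth checking the row count after the join.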
  • helpme12345 (Member)
    edited May 2023

    Hi @GrantSmith

    I did what you proposed, but when I use that CASE statement in my Beast Mode for a line graph by week, the dates are duplicated on the graph.

    Any thoughts on what could trigger this?

  • Are you grouping by anything? Have you selected to graph by day or month in your date selector (upper right of the graph)?

    What type of graph are you using?
