It's easy to remove duplicates... How do I KEEP only the duplicates in a dataset?

I have a large dataset that I am constantly adding to, and the unique identifier for each item is a 17- or 22-character text string.  It is critical for me to quickly identify whether I have duplicated a previous item when I add to the dataset.

ETL makes it simple to remove duplicates from a dataset, but is there a way to eliminate everything BUT the duplicates?  Ideally, I'd like to either:

1.  Create an alert any time a new duplicate value is added to the dataset, or

2.  Create an output dataset that consists ONLY of the rows that have a duplicate value in a specific column.
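
To make the two goals concrete, here is a minimal sketch in plain Python rather than in the ETL tool itself. The column name item_id, the sample rows, and both function names are hypothetical and only illustrate the logic: count how often each identifier occurs, then keep the rows whose count is greater than one, or flag incoming identifiers that are already present.

```python
from collections import Counter


def keep_only_duplicates(rows, key):
    """Return every row whose value in `key` occurs more than once."""
    counts = Counter(row[key] for row in rows)
    return [row for row in rows if counts[row[key]] > 1]


def find_new_duplicates(existing_ids, new_ids):
    """Return incoming IDs that already exist or repeat within the batch."""
    seen = set(existing_ids)
    flagged = []
    for item_id in new_ids:
        # Flag each offending ID once, in the order it was first caught.
        if item_id in seen and item_id not in flagged:
            flagged.append(item_id)
        seen.add(item_id)
    return flagged


# Hypothetical sample data: 17- and 22-character identifiers.
rows = [
    {"item_id": "A" * 17, "note": "first"},
    {"item_id": "B" * 22, "note": "second"},
    {"item_id": "A" * 17, "note": "third"},
]

duplicates = keep_only_duplicates(rows, "item_id")
for row in duplicates:
    print(row["item_id"], row["note"])
```

Note that keep_only_duplicates returns all copies of a repeated identifier, not just the extra ones, so both the original row and its repeat show up for review.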

Thanks in advance for any help.

Best Answer

Answers

  • Thank you, that worked perfectly, and your instructions were clear and easy to implement!

  • Worked a treat.  Thanks!

  • I tried this but I keep getting duplicates.

    Which join did you use, and which columns did you drop from which dataset?

    My filter found no nulls.