Keeping only the latest date for each unique ID

Hi Guys,

I have a question about an ETL transformation. I have a dataset containing unique IDs and the dates assigned to them. In many cases, there are multiple dates associated with the same ID, and I want to keep only the latest date for each ID. For example, I have ID 2 with the date 10.18.2023. If my dataset updates in 2 days and I get a new entry with ID 2 and date 10.20.2023, I want the 10.18.2023 entry removed so that only 10.20.2023 is kept for ID 2. Is this just a remove-duplicates step? I'm not sure how the data would refresh when a new entry appears in the dataset.

Thoughts?

Thanks

Answers
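
One way to think about the "latest date per ID" logic from the question, as a rough sketch rather than a definitive answer. The question does not name the ETL tool, so this uses plain Python. The function name `keep_latest`, the sample rows, and the MM.DD.YYYY date format (inferred from the 10.18.2023 example) are all assumptions. The point is that a plain remove-duplicates on the ID alone typically keeps whichever row it meets first, not necessarily the newest, so the dates have to be compared explicitly.

```python
from datetime import datetime


def keep_latest(rows, date_format="%m.%d.%Y"):
    """Return one (id, date) row per ID, keeping the most recent date.

    `rows` is an iterable of (id, date_text) pairs. The default date
    format is an assumption based on the example in the question.
    """
    latest = {}
    for row_id, date_text in rows:
        # Parse the text so dates are compared as dates, not as strings.
        parsed = datetime.strptime(date_text, date_format)
        # Store the row only if the ID is new or this date is more recent.
        if row_id not in latest or parsed > latest[row_id][0]:
            latest[row_id] = (parsed, date_text)
    return [(row_id, date_text) for row_id, (_, date_text) in latest.items()]


# Hypothetical sample data mirroring the example in the question.
rows = [
    (1, "10.15.2023"),
    (2, "10.18.2023"),
    (2, "10.20.2023"),  # newer entry for ID 2 that arrives after a refresh
]

result = keep_latest(rows)
print(result)
```

If a step like this runs on every refresh, a newly arrived row for an existing ID should replace the older one automatically, because the comparison is repeated over the full dataset each time. In a visual ETL tool, the equivalent is usually either a group-by on ID with a max on the date, or a sort by date descending followed by remove duplicates on ID, depending on what the tool offers.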