I am trying to do this with magic ETL but I am not having much luck.
I'm assuming by the latest row you mean the one with the most recent date and not the last row in your dataset.
Using Magic ETL - feed your dataset into an add constant tile, add a constant called "Join Column" with a value of 1, feed that into a group by tile and group by this new field selecting the Maximum date, then take this output and using a Join Tile do an inner join on your original dataset and the grouped dataset based on your Max date = date.
You can use the group by tile and then use batch_id and choose Max for your aggregate. Put date and user_id in your select list.
Thanks so much for your quick response, maybe I should have been slightly clearer in my question. My file has multiple duplicate rows with the same date, so for example 20/09/23 could appear 4 times. I would like to keep the row for each date that has the maximum batch_id.
You'd do the same thing above except use the batch id instead of the date to get the latest batch ID records.
Thanks both for your responses, you have been a great help.