I'm looking for an efficient way to only keep contacts that appeared in the year 2018 and exclude contacts that appear in the year 2019 AND 2018. It's possible for a contact to appear several times in the same year, so I cannot simply use the Remove Duplicates tile. In Magic ETL, how would I do the equivalent of:
SELECT email_addr, year FROM Dataset1
where year = '2018' and email_addr not in
(
SELECT email_addr FROM Dataset1
where year = '2019'
);
My current method is using 1 Input DataSet, then using 2 filters. One for 2018 and the other for 2019. Then, I am using Join Data to do Left Outer. Next, I'm using a Filter to exclude any rows where the Year is null for the Datatset with a 2019 filter. It works, but is there a more efficient way to do this?