Using Magic ETL For One To Many

kivlind
kivlind Contributor

I have one dataset with many, many lines per company number. I have another dataset with one line per company that holds the plan values. I want to combine the two in Magic ETL, but I don't want the plan number to repeat on every line (flatten the plan?) in the combined dataset. The goal is that, when the combined dataset is summarized by company number downstream, the plan total equals the plan file, which has one row per company number.

 

Hopefully this makes sense. I thought I was taught at one point how to do this in Magic, but I'm drawing a blank.

 

Thanks,

DK

Comments

  • jhl
    jhl Member

    Hi DK,

     

    Are the "many, many lines" duplicate values? If so, you might want to try the Remove Duplicates function in ETL.

     

    If they are not, which values from the larger dataset do you want to keep? If it is, for example, a dataset with a timestamp that gives you values for each company for each day, you can use Rank and Window to give the most recent value a rank of 1 and then Filter Rows to keep only those rows. Join on company ID and you should have what you want.
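
    As a rough illustration of the rank, filter, and join steps above, here is a minimal sketch. It is not Magic ETL itself: it uses Python's built-in sqlite3 module as a stand-in, with ROW_NUMBER() playing the role of Rank and Window. The table names (detail, plan), the column names, and the sample rows are all made up for the example.

```python
import sqlite3

# Hypothetical sample data; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE detail (company_id INTEGER, ts TEXT, value REAL);
CREATE TABLE plan (company_id INTEGER, plan_value REAL);
INSERT INTO detail VALUES
  (1, '2024-01-01', 10), (1, '2024-01-02', 12),
  (2, '2024-01-01', 7),  (2, '2024-01-03', 9);
INSERT INTO plan VALUES (1, 100), (2, 50);
""")

# Rank rows per company by timestamp (newest first), keep rank 1,
# then join the one-row-per-company plan on the company ID.
latest_with_plan = conn.execute("""
WITH ranked AS (
  SELECT company_id, ts, value,
         ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY ts DESC) AS rn
  FROM detail
)
SELECT r.company_id, r.ts, r.value, p.plan_value
FROM ranked AS r
JOIN plan AS p ON p.company_id = r.company_id
WHERE r.rn = 1
ORDER BY r.company_id
""").fetchall()

print(latest_with_plan)
```

    Because only one row per company survives the filter, the plan value appears once per company, so summing it downstream should match the plan file.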

     

    If you need aggregations, like the sum of all values for each company, I would suggest a MySQL dataflow with something like:

    SELECT CompanyID, SUM(value) AS total_value
    FROM dataset
    GROUP BY CompanyID;

     

    and then join the one-line-per-company plan dataset to it using a LEFT JOIN.
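
    To show the aggregate-then-join idea end to end, here is a small sketch under some assumptions. It runs the SQL through Python's built-in sqlite3 module rather than a Domo MySQL dataflow, and the table names (detail, plan), column names, and sample rows are invented for illustration. The aggregated result sits on the left side of the LEFT JOIN, so a company with detail rows but no plan still appears, with an empty plan value.

```python
import sqlite3

# Hypothetical sample data; company 3 has detail rows but no plan row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE detail (company_id INTEGER, value REAL);
CREATE TABLE plan (company_id INTEGER, plan_value REAL);
INSERT INTO detail VALUES (1, 10), (1, 12), (1, 5), (2, 7), (2, 9), (3, 4);
INSERT INTO plan VALUES (1, 100), (2, 50);
""")

# Collapse the detail rows to one row per company first,
# then attach the plan so each plan value appears exactly once.
rows = conn.execute("""
SELECT a.company_id, a.total_value, p.plan_value
FROM (
  SELECT company_id, SUM(value) AS total_value
  FROM detail
  GROUP BY company_id
) AS a
LEFT JOIN plan AS p ON p.company_id = a.company_id
ORDER BY a.company_id
""").fetchall()

print(rows)
```

    Aggregating before the join is what keeps the plan from repeating: both sides have at most one row per company when they meet.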

     

    HTH

  • Is anyone able to help out with this request?