Hello, I have two data sets, that I ma trying to combine,
And
After I combine the sets, I get dupplicates
Any reccomendations?
It appears your join is only on the month. If you have multiple values in the supplier code codes it will cause a cross product join creating duplicates. Try and include the supplier code in addition to the month in your join.
@Newbie23 Sine you're just joining on month it will cause duplication. For each line in your Test QTY Rejected dataset, it will take the month and look for that month in Test Received QTY. Since that one has multiples of the same month, it will add a row for each time that month appears. You would need to add supplier code as another primary key in your join. That would work as long as your Test QTY Rejected has no duplicate months per supplier code. If it still doesn't work, then you need to do further groupings in your datasets so that each one is at a supplier-month grain (each month only appears once per supplier.)
How are you combining your data? Are you using a join in Magic ETL? What kind of join are you using? What are the values you're joining? It appears you may have an issue with how you're joining your two datasets together. Make sure you have all of the key fields in the join to avoid duplication of data.
Hi,
I am using Magic ETL, and using the outer left, since I only need one column from the QTY received. I did noticed one data set has move rows than the other, but I was thinking those will be blank and the data will aligned based on the month.
Thank you both for your help, I did the join by month and supplier and the data looks good, however, the new datasets only shows two suppliers and I have a lot more that that. When I filter by supplier trying to find the same supplier A58, it says not in preview or not rows matching the criteria.
Swap the two datasets you're using in your join. You're basing all of your dataset from the suppliers which have rejected records and not all vendors from the received records.
TEST Received Qty should be on the left of your join, not the right.
Are you suggesting this
It works, see below, however my supplier A58 is not there, I wondering if I need to do some special grouping for my suppliers, most of my suppliers start with a letter and I don't see any of them, only see supplier codes that are numeric only
The main dataset for QTY received has 32 clolumns and I am extracting 3 columns: Month, supplier code and Qty received
Where does your A58 supplier exist? In both input datasets or just one?
Exists in both.
The QTY received has ALL the suppliers (several bussines units), I want to extract the suppliers for my bussines unit only ( about 15 suppupliers). the QTY rejected is for my business unit only.
I am going to try to filter by supplier first then join my datasets.
I added more columns to my QTY received, like supplier name and company, and drop the supplier code. I will be sorting by supplier name instead of supplier code. It seams to be working fine, I will try it again tomorrow.
Thanks for your help,