Data Join Confusion - Magic ETL

Hi,

I'm trying to join two sets of data together (assignments submission data & assignment summary data)

For the output, I am trying to get a list of all assignments with a due date in a date range that have not been submitted by the students.

The assignment submission data has a row for every student that lists every assignment that student has access to and provides data on when that student submitted that specific assignment for any individual row

The assignment summary data has a row for every individual assignment with the assignment's information (assignment name, due date, etc).

I have my two data sets and I am selecting which columns I want to pull from each

1: Assignment Submissions: DropboxID (to identify the specific assignment), OrgUnitID (to identify the course the assignment is in), SubmitterID (to identify the student), LastSubmissionDate, and CompletionDate

2: Assignment Summary: DropboxID (to identify the specific assignment), OrgUnitID (to identify the course the assignment is in), StartDate, EndDate, DueDate, Name

I'm trying to join these two data sets and then start with 1 filter that will allow me to validate that my returned data is what I am expecting: Filter based on OrgUnitID to get all assignments from a specific course.

As of right now, my results come back with 0 rows or 349 rows, neither of which make sense to me. The course that I am validating the data with has 30 assignments and 88 students. I would expect a data returned of 2,640 rows (one row for every student and assignment in the class - one student would be listed 30 different times due to having access to all assignments)

I've tried every type of join available in DOMO Magic ETL to try and figure out why the output is not what I expect, but I can't seem to figure it out. Here is a screenshot of my current ETL data flow and a screenshot of my current Join configuration.

Any help would be greatly appreciated!

Answers

  • Hey Jason,

    A few things I would confirm:

    • Do your input dataset contain the data and structure you are expecting? Can you spot check a few values that you would expect to see in the join and make sure the key matches in both datasets?
    • Is your filter tile doing what you think it is? Do you see your target rows if you remove it?
  • Yes, the data sets contain the data and structure I am expecting. I'm very familiar with these data sets.

    The filter is simply to limit the output to a specific course so that I don't receive 10s of thousands of rows of data. I can confirm that the filter is doing what I want. The filter works perfectly when used with a single data set. The joining of data sets is what keeps returning results that don't match what is really in the system.

  • Jones01
    Jones01 Contributor
    edited February 18

    I would remove the UTC tile and the filter tile.

    do the rows then appear in the final output?

    if not. Filter each set at the begining before the Select tile by the one orgunitid. Do you see 30 rows for 1 set and 2640 for the other in the preview?

    if you do then get a distinct list of dropboxids from the submission set. Are there 30? Do they match the 30 in the summary set?