Joining or Appending Data Help

user000253
user000253 Member
edited March 2021 in Magic ETL

I have a dataset (dataset 1) with individual rows as a specific job (with an identifier key) with details including year and country of operations. I have created cards to show the total number of jobs per year. I have another dataset (dataset 2) which shows historical revenue by Year and Country but it is not specific to a job (or identifier key). I am wanting to show revenue per year on the same card as the total number of jobs per year, so we can see the trend between the number of jobs and revenue. I've tried to use Magic ETL to "Join" or "Append" data from both datasets but without any good results. I guess the link or common column between the datasets is YEAR but when I merge these columns together the revenue per year is duplicated for everytime that YEAR is mentioned in dataset 1. For example for the whole year of 2009 in the UK, the revenue was $1000, it can't duplicated if there are more jobs in 2009 and in the UK.

Example:

Dataset 1

Job #123 United Kingdom 2009

Job #234 United Kingdom 2009

Job #345 United States 2010

Job #456 United Kingdom 2010


Dataset 2

United States 2009 $1000

United States 2010 $2000

United Kingdom 2009 $1000

United Kingdom 2010 $3000


How can I combine these datasets together to show Number of Jobs per year (bar chart) + Revenue per Year (Line)?


Thank you

Tagged:

Best Answer

  • bdavis
    bdavis Contributor
    Answer ✓

    If you left join the JOBS set with REVENUE set on location and year, you will end up with a column that shows the same revenue for every row in that location and year. You may be able to use this to show what you want if you want to keep the details about each individual job. If you're only looking for the number of jobs per year per location along with the revenue for the year and location, you can aggregate your JOBS set by grouping on location and year and adding a COUNT of the number of rows per group. Then, take the aggregated JOBS set, left join with REVENUE on location and year and you should end up with something like:

    LOCATION | YEAR | REVENUE

Answers

  • Hi @user000253

    I'd recommend using an APPEND tile in an ETL or a DataFusion to UNION your two datasets together. However one caveat to this would be they need to be in the same format / have the same columns

    You could process your data to look like the following:

    | Job ID | Country | Year | Amount |

    | 123 | UK | 2009 | NULL |

    | 234 | UK | 2009 | NULL |

    | NULL | UK | 2009 | 1000 |


    Then in your card you can define two different beast modes:

    Job Count:

    COUNT(`Job ID`)
    

    Revenue:

    SUM(`Amount`)
    

    Then use those two beast modes for your two different values (line and bar)

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**
  • Hi @GrantSmith thank you this is very helpful. By process my data do you mean to do this offline as in Excel? Or is there a function in DOMO that I can add columns and make both datasets have the same? As it stands Dataset 1 has 20 columns and dataset 2 only 3 (year, country, amount). Thanks

  • bdavis
    bdavis Contributor
    Answer ✓

    If you left join the JOBS set with REVENUE set on location and year, you will end up with a column that shows the same revenue for every row in that location and year. You may be able to use this to show what you want if you want to keep the details about each individual job. If you're only looking for the number of jobs per year per location along with the revenue for the year and location, you can aggregate your JOBS set by grouping on location and year and adding a COUNT of the number of rows per group. Then, take the aggregated JOBS set, left join with REVENUE on location and year and you should end up with something like:

    LOCATION | YEAR | REVENUE