Join in ETL

caiorogano · January 30

Question:
I have two separate databases where there is only one common column, which is the "salesperson."

The first database contains sales values for each salesperson.
The second database contains the area of operation and products associated with each salesperson.

I plan to create a new column to calculate the value/area of operation ratio. The challenge is that when I filter out products, I want the sales value to remain constant, but the area of operation should decrease, thus increasing the value/area ratio.

How can I merge these two databases, ensuring that I can calculate the adjusted value/area ratio dynamically as I filter products, while keeping the value intact?

Any advice or step-by-step guidance would be greatly appreciated!

This version adds the specific calculation and filtering process you're aiming for.

Data_Devon · January 31

Hello caiorogano!

Why would the sales remain constant even if products are removed?

Also - what is your relationship between the datasets assuming that Salesperson is your Join Key? Is there one row per Salesperson, and values/area of operation/products are pivoted out to new rows?

I want to make sure I understand before I offer my two pennies. Perhaps a GOAT can read through the lines to help, but I don't want to lead you astray with insufficient information.

caiorogano · February 7

My ultimate goal is to calculate USD per meter, using the Salesperson field as the key. In one dataset, I have all the revenue (USD) per salesperson, and in another dataset, I have Salesperson, Area of operation (meters), and products sold for each of them. I need to filter products while keeping the revenue value constant because some products don’t make sense for certain salespeople in specific regions.

And to answer your question, I have one row per salesperson for each year in the revenue dataset.

DavidChurchman · February 10

Can you create a fake version of your two datasets with at least a few rows of what the data looks like? That will get you a fast, more reliable answer.

Based on you've said so far, I think what you'll need to do is pivot your Salesperson data long and join on the operation and product with the other table. To do the filtering you want, instead of having a table that's like this:

You want one that looks like this:

Then when you filter by area/product, you'll still be able to get totals by Salesperson. (Or filter by Salesperson and see all the associated products.)

Join in ETL

Comments

Categories