I have 2 datasets:
- Contains individual transactions with an employee's name
Employee Name | Sale Price | Sale Date |
John Doe | 100 | 11/2/20 |
John Doe | 250 | 11/6/20 |
Jane Doe | 500 | 11/8/20 |
Jane Doe | 100 | 11/12/20 |
- Contains each employee and their number of consultations and new clients
Employee Name | Start Date | End Date | Consultations | New Clients |
John Doe | 11/1/20 | 11/7/20 | 3 | 1 |
Jane Doe | 11/1/20 | 11/7/20 | 2 | 2 |
John Doe | 11/8/20 | 11/14/20 | 4 | 3 |
Jane Doe | 11/8/20 | 11/14/20 | 5 | 4 |
I need to be able to filter both data sets such that dataset a) contains a single row for each employee with their total sales within a given date range and dataset b) contains a single row for each employee with their total consultations and new clients. Once I have these datasets, I will need to join them on the employee's name to create a single dataset containing total sales, consultations and new clients.
Employee Name | Consultations | Clients | Sale Price |
John Doe | 7 | 4 | 350 |
Jane Doe | 7 | 6 | 600 |
I was able to accomplish this, but I run into an issue where I have to do the filtering within the ETL, which forces me to edit the ETL each time I want to query a new date range. If I try to do the filtering in a card and remove the totals, I end up with duplicated data on that final table join.
Employee Name | Start Date | End Date | Consultations | New Clients | Sale Date | Sale Price |
John Doe | 11/1/20 | 11/7/20 | 3 | 1 | 11/2/20 | 100 |
John Doe | 11/1/20 | 11/7/20 | 3 | 1 | 11/6/20 | 250 |
Jane Doe | 11/1/20 | 11/7/20 | 2 | 2 | 11/8/20 | 500 |
Jane Doe | 11/1/20 | 11/7/20 | 2 | 2 | 11/12/20 | 100 |
John Doe | 11/8/20 | 11/14/20 | 4 | 3 | 11/2/20 | 100 |
John Doe | 11/8/20 | 11/14/20 | 4 | 3 | 11/6/20 | 250 |
Jane Doe | 11/8/20 | 11/14/20 | 5 | 4 | 11/8/20 | 500 |
Jane Doe | 11/8/20 | 11/14/20 | 5 | 4 | 11/12/20 | 100 |
Is there some way to make this work such that I can query for a date range after running the ETL and my data isn't duplicated?