Combine DataSets to a Filtered Count/Aggregated DataSet with different colums
TLDR: This is a complicated dataset (for me) that I am trying to build. Basically, I am taking data from two datasets, with different columns, aggregating them to columns that is listed correctly in rows... If that makes sense. I need help going about it the right way.
Enviroment:
Below is sample data I have and what I am trying to build from it... so you have an idea of what I am working with.
Input DataSet 1:
DateA: | DateB: | Employee: | Sales Stage: |
3/3/2018 | 3/5/2018 | John Doe | Closed Won |
3/25/2018 | 3/27/2018 | Jayne Doe | Closed Lost |
3/16/2018 | 3/18/2018 | John Doe | Closed Won |
4/1/2018 | 4/5/2018 | John Doe | Closed Won |
4/15/2018 | 4/17/2018 | Jayne Doe | Closed Lost |
4/20/2018 | 4/21/2018 | Jayne Doe | Closed Won |
Input DataSet 2:
DateC: | Employee: | Comm Structure: | Sales: |
3/31/2018 | John Doe | Consultant | 100 |
3/31/2018 | Jayne Doe | Coach | 200 |
4/30/2018 | John Doe | Consultant | 300 |
4/30/2018 | Jayne Doe | Coach | 350 |
Please note that the 'Input DataSet 2' entires listed are monthly summery stats. Each employee will have one entry for the month (the date is just listed as the 'last day of the month').
Output DataSet (What I am trying to achieve):
Date: | Employee: | Comm Structure: | Sales: | Total Ops: | Ops Won: | Avg Op Sales: | Avg Op Sales Won: |
3-2018 | John Doe | Consultant | 100 | 2 | 2 | 50 | 50 |
3-2018 | Jayne Doe | Coach | 200 | 1 | 0 | 200 | 0 |
4-2018 | John Doe | Consultant | 300 | 1 | 1 | 300 | 300 |
4-2018 | Jayne Doe | Coach | 350 | 2 | 1 | 175 | 350 |
What I currently have setup is using a dataflow to append the rows together with all columns from both tables. I then have calculated field(s) via 'Beast Mode' to 'Count' the 'Closed Lost' and 'Closed Won' sales stages. I am having a hard time using the count function by grouping the months together, but also inserting it on the same row as the summery entry for that specific month and employee. Below is what I currently got:
- CASE WHEN DATE_FORMAT(`DateB`,'%M %Y') = DATE_FORMAT(`DateC`,'%M %Y') THEN
COUNT(`Sales Stage`)
END
Once I get the counts inserted correctly in the row, I can get the averages from there.
Question:
Am I going about this all wrong? I have been working on this for a couple days, but I am getting no where. Should I be filtering/agregatting the data with the options availible in the dataflow first? Any direction on how I may accomplish this would be appreciated.
Best Answer
-
You can accomplish this using a SQL Dataflow with two steps:
1) Summarize the Won/Lost counts as column values at a monthly level by Employee
2) JOIN the Summary data to the Sales data on the Employee and Date.
(These steps are combined in the statement below)
- -- Aggregate all to Monthly Level and Pivot out Won and Lost columns
SELECT
last_day(a.`dateb`) as `EndofMonth`, -- attribute to last day of month
a.`Employee`,
b.`Comm Structure`,
b.`Sales`,
count(case when `Sales Stage` = 'Closed Won' THEN a.`Employee` END) as `Won`,
count(case when `Sales Stage` = 'Closed Lost' THEN a.`Employee` END) as `Lost`
FROM dataset1 a
JOIN dataset2 b on a.`Employee` = b.`Employee` and last_day(a.`dateb`) = b.`datec`
GROUP BY last_day(a.`dateb`),
a.`Employee`,
b.`Comm Structure`,
b.`Sales`
This assumes you have both Won/Lost entries and Sales in the same month for all employees.
It could also be done in Magic ETL if you aren't familiar with SQL syntax, but it would take more steps to accomplish.
Jacob Folsom
**Say “Thanks” by clicking the “heart” in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"1 - -- Aggregate all to Monthly Level and Pivot out Won and Lost columns
Answers
-
I am happy to help with this and can see how it's been a challenge. Can you clarify for me on the top row of dataset1, what is DateA vs DateB and how do they relate to the Sales Stage...
DateA: DateB: Employee: Sales Stage: 3/3/2018 3/5/2018 John Doe Closed Won I want to make sure I understand how to attribute the Ops Won to a single date column.
Thanks,
Jacob Folsom
**Say “Thanks” by clicking the “heart” in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"2 -
Hi DataJake,
DateA: Is the date the lead was obtained, or when the opportunity was created
DateB: Is the date when the employee determined the opportunity was either 'closed' or 'won'
To my understanding, DateA is irrelevant for what I am trying to accomplish. I was trying to take DateB, and check if it occurs within the same month/year of DateC.... and then somehow, count all those that occurred within that month/year for that particular employee.
Thanks though for taking a look at this.
0 -
Ok, great. I understand what you are trying to do.. So let's remove DateA from the equation.
Solution 1 - Append Datasets
In your Magic ETL job where you append data, rename your Date Columns so they have the same name, this will allow you to retain the granularity of the closed/won activity and also the monthly sales details.
DateB = Date
DateC = Date
Your output should look like this:
Date Employee Comm Structure Sales Sales Stage 3/31/18 John Doe Consultant 100 3/31/18 Jayne Doe Coach 200 4/30/18 John Doe Consultant 300 4/30/18 Jayne Doe Coach 350 3/5/18 John Doe Closed Won 3/27/18 Jayne Doe Closed Lost 3/18/18 John Doe Closed Won 4/5/18 John Doe Closed Won 4/17/18 Jayne Doe Closed Lost 4/21/18 Jayne Doe Closed Won LIMITATION: this will still not apply the comm structure to the Employees in the Sales Stage Metrics. You will need a join for that instead of the append.
BEASTMODE: Closed Lost Count
COUNT(CASE WHEN `Sales Stage` = 'Closed Lost' THEN `Employee` END)
Consideration
Does the Comm struture for a given employee change from month to month? If not, then you could have a reference table that you join the appended data to in order to have it apply to all employees. If it can change from month to month, then you will need to join on Employee + End of Month column, which would require a little more logic in your dataflow.
Here is a sample workflow for the Select Columns (Rename) and Append, folllowed up with a branch that gets the Comm Structure per employee (with Remove Duplicates so there is one row per employee)
This now will have a comm structure reference for all rows of an employee.
Does this help?
Jacob Folsom
**Say “Thanks” by clicking the “heart” in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"2 -
Hello DataJake,
Thank you for the through response. To follow up, I replicated what you provided, and I think we are closer to what I am trying to achieve, but not quite there yet. The issue I have currently, is similiar to what I had before my OP.
In our example, the beast mode 'Closed Lost Count' does count all occurances for each row in the table, however, I am trying to insert the sum total of these "counts" in the same row(s) as the 'Sales' data (Or at least, that was my original plan). Then my plan was to do a quick beast mode average calculation for those rows (`Sales` / (Sum of total count for each employee/month/year)).
Currently my output dataset with BM column you provided looks similiar to below:
Date Employee Comm Structure Sales Sales Stage BM: SS Counts 3/31/18 John Doe Consultant 100 0 3/31/18 Jayne Doe Coach 200 0 4/30/18 John Doe Consultant 300 0 4/30/18 Jayne Doe Coach 350 0 3/5/18 John Doe Consultant Closed Won 1 3/27/18 Jayne Doe Coach Closed Lost 1 3/18/18 John Doe Consultant Closed Won 1 4/5/18 John Doe Consultant Closed Won 1 4/17/18 Jayne Doe Coach Closed Lost 1 4/21/18 Jayne Doe Coach Closed Won 1 I was attempting to do something similiar to below:
Date Employee Comm Structure Sales Sales Stage BM: SS Count BM: SS Sum BM: Avg Sale 3/31/18 John Doe Consultant 100 0 2 50 3/31/18 Jayne Doe Coach 200 0 1 200 4/30/18 John Doe Consultant 300 0 1 300 4/30/18 Jayne Doe Coach 350 0 2 175 3/5/18 John Doe Consultant Closed Won 1 3/27/18 Jayne Doe Coach Closed Lost 1 3/18/18 John Doe Consultant Closed Won 1 4/5/18 John Doe Consultant Closed Won 1 4/17/18 Jayne Doe Coach Closed Lost 1 4/21/18 Jayne Doe Coach Closed Won 1 The end goal here is to have a card display a 'grouped bar graph', grouped by employee of their average sales per month.
So my question is, is it possible to take these counts from the various rows and insert their sums into the same rows as the 'Sales'? Or is there another way to accoplish my end goal?
Thanks again for your help so far.
0 -
You can accomplish this using a SQL Dataflow with two steps:
1) Summarize the Won/Lost counts as column values at a monthly level by Employee
2) JOIN the Summary data to the Sales data on the Employee and Date.
(These steps are combined in the statement below)
- -- Aggregate all to Monthly Level and Pivot out Won and Lost columns
SELECT
last_day(a.`dateb`) as `EndofMonth`, -- attribute to last day of month
a.`Employee`,
b.`Comm Structure`,
b.`Sales`,
count(case when `Sales Stage` = 'Closed Won' THEN a.`Employee` END) as `Won`,
count(case when `Sales Stage` = 'Closed Lost' THEN a.`Employee` END) as `Lost`
FROM dataset1 a
JOIN dataset2 b on a.`Employee` = b.`Employee` and last_day(a.`dateb`) = b.`datec`
GROUP BY last_day(a.`dateb`),
a.`Employee`,
b.`Comm Structure`,
b.`Sales`
This assumes you have both Won/Lost entries and Sales in the same month for all employees.
It could also be done in Magic ETL if you aren't familiar with SQL syntax, but it would take more steps to accomplish.
Jacob Folsom
**Say “Thanks” by clicking the “heart” in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"1 - -- Aggregate all to Monthly Level and Pivot out Won and Lost columns
Categories
- All Categories
- 1.8K Product Ideas
- 1.8K Ideas Exchange
- 1.6K Connect
- 1.2K Connectors
- 300 Workbench
- 6 Cloud Amplifier
- 9 Federated
- 2.9K Transform
- 103 SQL DataFlows
- 627 Datasets
- 2.2K Magic ETL
- 3.9K Visualize
- 2.5K Charting
- 755 Beast Mode
- 61 App Studio
- 41 Variables
- 693 Automate
- 178 Apps
- 456 APIs & Domo Developer
- 49 Workflows
- 10 DomoAI
- 38 Predict
- 16 Jupyter Workspaces
- 22 R & Python Tiles
- 398 Distribute
- 115 Domo Everywhere
- 276 Scheduled Reports
- 7 Software Integrations
- 130 Manage
- 127 Governance & Security
- 8 Domo Community Gallery
- 38 Product Releases
- 11 Domo University
- 5.4K Community Forums
- 40 Getting Started
- 30 Community Member Introductions
- 110 Community Announcements
- 4.8K Archive