Label ranked rows based on consecutive condition
Hi Folks
I'm looking to label multiple rows but only when they consecutively reach 8 or more Rank criteria.
I'm ordering the rank on the date, (magic ETL) and partitioning by rule2_hours. When the Rank reaches 8 or more, I want each row leading up to the highest number to be labeled - see the colour field.
All feedback appreciated
Kind regards,
Best Answer
-
@NateBI I was able to accomplish what you are looking to do using a Redshift Dataflow. If you don't currently have Redshift dataflows enabled you can contact your CSM, or you could attempt this same logic in Magic.
First, I created a new field that gives the order of the data, similar to rank but for every row in sequential order.
Second, I used this case statement in the Dataflow:
CASE
WHEN "rank" = 1 AND lead("rank", 7) over (partition by "rule2_hours" order by "order") = 8 THEN 'Colour'
WHEN "rank" = 2 AND lead("rank", 6) over (partition by "rule2_hours" order by "order") = 8 THEN 'Colour'
WHEN "rank" = 3 AND lead("rank", 5) over (partition by "rule2_hours" order by "order") = 8 THEN 'Colour'
WHEN "rank" = 4 AND lead("rank", 4) over (partition by "rule2_hours" order by "order") = 8 THEN 'Colour'
WHEN "rank" = 5 AND lead("rank", 3) over (partition by "rule2_hours" order by "order") = 8 THEN 'Colour'
WHEN "rank" = 6 AND lead("rank", 2) over (partition by "rule2_hours" order by "order") = 8 THEN 'Colour'
WHEN "rank" = 7 AND lead("rank", 1) over (partition by "rule2_hours" order by "order") = 8 THEN 'Colour'
WHEN "rank" >= 8 THEN 'Colour' ELSE 'No Colour'
END as "testing"And it worked! See results below:
This was fun, thanks for the challenge. If this solved your problem please accept this as the answer, if not let me know and I'll attempt the same thing in Magic ETL.If I solved your problem, please select "yes" above
2
Answers
-
@NateBI I was able to accomplish what you are looking to do using a Redshift Dataflow. If you don't currently have Redshift dataflows enabled you can contact your CSM, or you could attempt this same logic in Magic.
First, I created a new field that gives the order of the data, similar to rank but for every row in sequential order.
Second, I used this case statement in the Dataflow:
CASE
WHEN "rank" = 1 AND lead("rank", 7) over (partition by "rule2_hours" order by "order") = 8 THEN 'Colour'
WHEN "rank" = 2 AND lead("rank", 6) over (partition by "rule2_hours" order by "order") = 8 THEN 'Colour'
WHEN "rank" = 3 AND lead("rank", 5) over (partition by "rule2_hours" order by "order") = 8 THEN 'Colour'
WHEN "rank" = 4 AND lead("rank", 4) over (partition by "rule2_hours" order by "order") = 8 THEN 'Colour'
WHEN "rank" = 5 AND lead("rank", 3) over (partition by "rule2_hours" order by "order") = 8 THEN 'Colour'
WHEN "rank" = 6 AND lead("rank", 2) over (partition by "rule2_hours" order by "order") = 8 THEN 'Colour'
WHEN "rank" = 7 AND lead("rank", 1) over (partition by "rule2_hours" order by "order") = 8 THEN 'Colour'
WHEN "rank" >= 8 THEN 'Colour' ELSE 'No Colour'
END as "testing"And it worked! See results below:
This was fun, thanks for the challenge. If this solved your problem please accept this as the answer, if not let me know and I'll attempt the same thing in Magic ETL.If I solved your problem, please select "yes" above
2 -
@MichelleH thank you, this would be plan B since it would only highlight 8 and above but I need them all to be labeled. @colemenwilson I figured that someone would enjoy it! 😆 This looks right, I'll keep it in Reedshift - in Magic ETL that's 7 R&W tiles, plus Join tiles, per scenario and I have another two to sort like this. Thanks both
0 -
@MichelleH @colemenwilson
Issue Update: My rank field is not skipping nulls or restarting the count - this is impacting the logic you've sent. Here are the settings:0 -
You'll want both rank and order fields, so keep your rank field as it was before. For order, you can use any number of field combinations to determine the order. I am not sure what other fields you have available but I would think some combination of Date, Batch ID and/or Batch last run date should get the order right. You'll probably also want to remove the partition option. How are you ordering your data currently?
If I solved your problem, please select "yes" above
0 -
@colemenwilson The order field is working fine, it's based on the date. It's the Rank field that wasn't set up correctly to match the dummy table I sent in the first question. So now, I'm going back a stage to replicate the rank field - if possible - to support the final stage - your solution - of running a lead over each number.
0 -
Gotcha, I thought the rank was already solved for. Okay, so the results you are getting are consistent with what I would expect to see based on how your ETL is setup. To have the rank start over after a null there needs to be some other field the rank is considering, if it is just referencing date and rule2_hours_marker then the rank will continue after nulls. Is there some other field you can use to group a streak together? For example, a batch, session, team, etc…?
If I solved your problem, please select "yes" above
0 -
@colemenwilson Sweet, spoken with @GrantSmith and he suggested:
step one: cumulative sum to calculate the group ID
step_two: rank partitioned by step_one field
Which landed this here:I then run a lead over step_two to get the previous value and where previous_value_step_two = step_two then then 0 else step_two.
'step_three' now represents the original 'Rank' field concept.
No onto your solution for the colour marker.0
Categories
- All Categories
- 1.8K Product Ideas
- 1.8K Ideas Exchange
- 1.5K Connect
- 1.2K Connectors
- 300 Workbench
- 6 Cloud Amplifier
- 8 Federated
- 2.9K Transform
- 100 SQL DataFlows
- 616 Datasets
- 2.2K Magic ETL
- 3.9K Visualize
- 2.5K Charting
- 738 Beast Mode
- 57 App Studio
- 40 Variables
- 685 Automate
- 176 Apps
- 452 APIs & Domo Developer
- 47 Workflows
- 10 DomoAI
- 36 Predict
- 15 Jupyter Workspaces
- 21 R & Python Tiles
- 394 Distribute
- 113 Domo Everywhere
- 275 Scheduled Reports
- 6 Software Integrations
- 124 Manage
- 121 Governance & Security
- 8 Domo Community Gallery
- 38 Product Releases
- 10 Domo University
- 5.4K Community Forums
- 40 Getting Started
- 30 Community Member Introductions
- 108 Community Announcements
- 4.8K Archive