Finding quartiles and outputting the data within each quartile
Hi
I'm trying to break a dataset down into quartiles for Recency, Frequency, and Monetary (each data item will have a value for each).
I then need to output the items that are within a given quartile in order to do further data manipulation.
I know that box plot graphs can show quartile values, but they don't actually mark up or export the data in a way it can be further used.
I've spent a number of hours looking through SQL forums as I couldn't find any answers here, but my SQL isn't great. The best of found is finding the median of a value https://knowledge.domo.com/Prepare/DataFlow_Tips_and_Tricks/Finding_the_Median_of_a_Value but then how do I go from there to find the quartiles?
Best Answer

artywah,
This sounds like something that you could do in mySQL or Magic ETL. I can think of a couple of ways you could do this:
 Median: You could find your median and then split the data so that the values below the median go into one transform and the ones above your median go into another (you would have to determine which half gets the actual median record). From each transform that has half of the data, you could repeat the same process, splitting the transform into two parts. This would result in 4 parts that represent your quartiles for the metric of interest (recency, frequency, or monetary).
 Ranking: You could rank your data by one of the metrics you referred to and then divide the data based on your total record count (the total would need to be added as a constant in its own column). Splitting up the data could be done with a case statement in SQL or the filter rows tile in Magic ETL. Here is an example of a case statement that could help get you to your end goal:
CASE
WHEN `rank_column_name` < (.25 * `total_row_count_column`) THEN 'Quartile 1'
WHEN `rank_column_name` >= (.25 * `total_row_count_column`) AND`rank_column_name` < (.25 * `total_row_count_column`) THEN 'Quartile 2'
WHEN`rank_column_name` >= (.5 * `total_row_count_column`) AND `rank_column_name` < (.75 * `total_row_count_column`) THEN 'Quartile 3'
WHEN`rank_column_name` >= (.75 * `total_row_count_column`) THEN 'Quartile 4'
END
Good luck as you work on your use case!
Regards,
Darius Rose
**Say “Thanks” by clicking the “heart” in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"2
Answers

artywah,
This sounds like something that you could do in mySQL or Magic ETL. I can think of a couple of ways you could do this:
 Median: You could find your median and then split the data so that the values below the median go into one transform and the ones above your median go into another (you would have to determine which half gets the actual median record). From each transform that has half of the data, you could repeat the same process, splitting the transform into two parts. This would result in 4 parts that represent your quartiles for the metric of interest (recency, frequency, or monetary).
 Ranking: You could rank your data by one of the metrics you referred to and then divide the data based on your total record count (the total would need to be added as a constant in its own column). Splitting up the data could be done with a case statement in SQL or the filter rows tile in Magic ETL. Here is an example of a case statement that could help get you to your end goal:
CASE
WHEN `rank_column_name` < (.25 * `total_row_count_column`) THEN 'Quartile 1'
WHEN `rank_column_name` >= (.25 * `total_row_count_column`) AND`rank_column_name` < (.25 * `total_row_count_column`) THEN 'Quartile 2'
WHEN`rank_column_name` >= (.5 * `total_row_count_column`) AND `rank_column_name` < (.75 * `total_row_count_column`) THEN 'Quartile 3'
WHEN`rank_column_name` >= (.75 * `total_row_count_column`) THEN 'Quartile 4'
END
Good luck as you work on your use case!
Regards,
Darius Rose
**Say “Thanks” by clicking the “heart” in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"2
Categories
 All Categories
 1.7K Product Ideas
 1.7K Ideas Exchange
 1.5K Connect
 1.2K Connectors
 292 Workbench
 4 Cloud Amplifier
 8 Federated
 2.8K Transform
 95 SQL DataFlows
 605 Datasets
 2.1K Magic ETL
 3.7K Visualize
 2.4K Charting
 696 Beast Mode
 43 App Studio
 39 Variables
 660 Automate
 170 Apps
 442 APIs & Domo Developer
 42 Workflows
 6 DomoAI
 33 Predict
 13 Jupyter Workspaces
 20 R & Python Tiles
 388 Distribute
 111 Domo Everywhere
 271 Scheduled Reports
 6 Software Integrations
 113 Manage
 110 Governance & Security
 9 Domo University
 31 Product Releases
 Community Forums
 40 Getting Started
 30 Community Member Introductions
 98 Community Announcements
 Domo Community Gallery
 4.8K Archive