Correlation Coefficient in Beastmode
Has anyone successfully calculated a Pearson Correlation Coefficient in Beastmode? My idea is to create a scatter plot that graphically displays the relationship between two fields in a dataset, and display the Correlation Coefficient for those two datasets as the Summary Number.
For example, if the card is filtered for a date range of March 1  March 31, the key fields in the underlying dataset might look like the following table, which would result in a Correlation Coefficient of 0.5669 (this is the value that I would want to display as the Summary Number for the card). Is this possible?
Thanks!
Revenue  Duration 
$603  185 
$262  123 
$906  287 
$429  179 
$872  266 
$302  160 
$185  126 
$596  293 
$481  234 
$173  104 
$628  11 
$296  194 
$272  46 
Best Answers

I also cannot get the corr() function to work. However I did get it to work by manually creating the sql math in a beastmode.
Use this code:
(count(x) * sum(x * y)  sum(x) * sum(y)) / (sqrt(count(x) * sum(x * x)  sum(x) * sum(x)) * sqrt(count(x) * sum(y * y)  sum(y) * sum(y)))
Then replace x and y with your measure names.
3 
How to build a correlation plot to compare multiple variables!
Jae Wilson
Check out my 🎥 Domo Training YouTube Channel 👨💻
**Say "Thanks" by clicking the ❤️ in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"2
Answers

You can do it in a data transform, but couldn't in a beastmode as far as my testing.
You'll need to create a data transform that calculates the Pearson R using the following logic:
(Avg(`Revenue` * `Duration`) – (Avg(`Revenue`) * Avg(`Duration`))) / (STDDEV_POP(`Revenue`) * STDDEV_POP(`Duration`))
You have to break it into 2 selects, the first:
SELECT Avg(`Revenue` * `Duration`) AS a,
(Avg(`Revenue`) * Avg(`Duration`)) AS b,
(STDDEV_POP(`Revenue`) * STDDEV_POP(`Duration`)) as c
FROM testingAnd the 2nd:
SELECT (`a``b`)/`c` as 'Pearson r'
FROm transform_data_1As you can see, this will give you the correct result:
From there you can basically do the following
SELECT *, (SELECT `Pearson R` from PearsonCalculation) as `Pearson`
FROM datasetThen use a beastmode for MAX(`Pearson R`) as your summary number. It won't be dynamic where you can change the date range, but maybe someone else can figure out a way to get the beastmode portion to work.
Sincerely,
ValiantSpur
**Please mark "Accept as Solution" if this post solves your problem
**Say "Thanks" by clicking the "heart" in the post that helped you.1 
Second option is to use the Corr function (which does a Pearson Correlation for you)
corr(`Duration`,`Revenue`)
You can use that as a beastmode but it doesn't let you put it in the summary number.
2 
Thanks for your ideas! I'll try the first one later on. I did try using the Corr() function in beastmode, but regardless of which fields I use I keep receiving the error "An issue has occurred during processing. We are unable to complete the request at this time." I noticed that Corr does not appear in the Functions list  perhaps it is in beta version?
0 
I don't believe so, we aren't running any beta versions in our production environment and I was able to get it to work. Perhaps make sure both fields are integers?
1 
I've tried multiple times to make the corr() function work, even creating a very simple test dataset that matches the one I provided earlier. No matter what I try, the formula validates, but I always receive the error
"An issue has occurred during processing. We are unable to complete the request at this time."Have you gotten the corr() function to work?
Thanks again for all your help
0 
Yes, I was actually able to use your test dataset, create a beastmode with the corr function and then it worked fine. It sounds like it may be worth reaching out to Domo Support at this point. There may be a quirk going on with your instance.
1 
I also cannot get the corr() function to work. However I did get it to work by manually creating the sql math in a beastmode.
Use this code:
(count(x) * sum(x * y)  sum(x) * sum(y)) / (sqrt(count(x) * sum(x * x)  sum(x) * sum(x)) * sqrt(count(x) * sum(y * y)  sum(y) * sum(y)))
Then replace x and y with your measure names.
3 
Smart workaround!
0 
How to build a correlation plot to compare multiple variables!
Jae Wilson
Check out my 🎥 Domo Training YouTube Channel 👨💻
**Say "Thanks" by clicking the ❤️ in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"2 
@jaeW_at_Onyx I cleaned up your post.
Going forward use the word editor click on video and drop the youtube url in there "from the web".
Keep these coming.
Cheers!
0
Categories
 10.5K All Categories
 8 Connect
 918 Connectors
 250 Workbench
 470 Transform
 1.7K Magic ETL
 69 SQL DataFlows
 477 Datasets
 193 Visualize
 252 Beast Mode
 2.1K Charting
 11 Variables
 17 Automate
 354 APIs & Domo Developer
 89 Apps
 3 Workflows
 20 Predict
 5 Jupyter Workspaces
 15 R & Python Tiles
 247 Distribute
 63 Domo Everywhere
 243 Scheduled Reports
 21 Manage
 42 Governance & Security
 174 Product Ideas
 1.2K Ideas Exchange
 12 Community Forums
 27 Getting Started
 14 Community Member Introductions
 55 Community News
 4.5K Archive