Correlation Coefficient in Beastmode
Has anyone successfully calculated a Pearson Correlation Coefficient in Beastmode? My idea is to create a scatter plot that graphically displays the relationship between two fields in a dataset, and display the Correlation Coefficient for those two datasets as the Summary Number.
For example, if the card is filtered for a date range of March 1  March 31, the key fields in the underlying dataset might look like the following table, which would result in a Correlation Coefficient of 0.5669 (this is the value that I would want to display as the Summary Number for the card). Is this possible?
Thanks!
Revenue  Duration 
$603  185 
$262  123 
$906  287 
$429  179 
$872  266 
$302  160 
$185  126 
$596  293 
$481  234 
$173  104 
$628  11 
$296  194 
$272  46 
Best Answers

I also cannot get the corr() function to work. However I did get it to work by manually creating the sql math in a beastmode.
Use this code:
(count(x) * sum(x * y)  sum(x) * sum(y)) / (sqrt(count(x) * sum(x * x)  sum(x) * sum(x)) * sqrt(count(x) * sum(y * y)  sum(y) * sum(y)))
Then replace x and y with your measure names.
4 
How to build a correlation plot to compare multiple variables!
Jae Wilson
Check out my 🎥 Domo Training YouTube Channel 👨💻
**Say "Thanks" by clicking the ❤️ in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"2
Answers

You can do it in a data transform, but couldn't in a beastmode as far as my testing.
You'll need to create a data transform that calculates the Pearson R using the following logic:
(Avg(`Revenue` * `Duration`) – (Avg(`Revenue`) * Avg(`Duration`))) / (STDDEV_POP(`Revenue`) * STDDEV_POP(`Duration`))
You have to break it into 2 selects, the first:
SELECT Avg(`Revenue` * `Duration`) AS a,
(Avg(`Revenue`) * Avg(`Duration`)) AS b,
(STDDEV_POP(`Revenue`) * STDDEV_POP(`Duration`)) as c
FROM testingAnd the 2nd:
SELECT (`a``b`)/`c` as 'Pearson r'
FROm transform_data_1As you can see, this will give you the correct result:
From there you can basically do the following
SELECT *, (SELECT `Pearson R` from PearsonCalculation) as `Pearson`
FROM datasetThen use a beastmode for MAX(`Pearson R`) as your summary number. It won't be dynamic where you can change the date range, but maybe someone else can figure out a way to get the beastmode portion to work.
Sincerely,
ValiantSpur
**Please mark "Accept as Solution" if this post solves your problem
**Say "Thanks" by clicking the "heart" in the post that helped you.1 
Second option is to use the Corr function (which does a Pearson Correlation for you)
corr(`Duration`,`Revenue`)
You can use that as a beastmode but it doesn't let you put it in the summary number.
2 
Thanks for your ideas! I'll try the first one later on. I did try using the Corr() function in beastmode, but regardless of which fields I use I keep receiving the error "An issue has occurred during processing. We are unable to complete the request at this time." I noticed that Corr does not appear in the Functions list  perhaps it is in beta version?
0 
I don't believe so, we aren't running any beta versions in our production environment and I was able to get it to work. Perhaps make sure both fields are integers?
1 
I've tried multiple times to make the corr() function work, even creating a very simple test dataset that matches the one I provided earlier. No matter what I try, the formula validates, but I always receive the error
"An issue has occurred during processing. We are unable to complete the request at this time."Have you gotten the corr() function to work?
Thanks again for all your help
0 
Yes, I was actually able to use your test dataset, create a beastmode with the corr function and then it worked fine. It sounds like it may be worth reaching out to Domo Support at this point. There may be a quirk going on with your instance.
1 
I also cannot get the corr() function to work. However I did get it to work by manually creating the sql math in a beastmode.
Use this code:
(count(x) * sum(x * y)  sum(x) * sum(y)) / (sqrt(count(x) * sum(x * x)  sum(x) * sum(x)) * sqrt(count(x) * sum(y * y)  sum(y) * sum(y)))
Then replace x and y with your measure names.
4 
Smart workaround!
0 
How to build a correlation plot to compare multiple variables!
Jae Wilson
Check out my 🎥 Domo Training YouTube Channel 👨💻
**Say "Thanks" by clicking the ❤️ in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"2 
@jaeW_at_Onyx I cleaned up your post.
Going forward use the word editor click on video and drop the youtube url in there "from the web".
Keep these coming.
Cheers!
0
Categories
 All Categories
 1.2K Product Ideas
 1.2K Ideas Exchange
 1.3K Connect
 1.1K Connectors
 273 Workbench
 Cloud Amplifier
 3 Federated
 2.7K Transform
 78 SQL DataFlows
 524 Datasets
 2.1K Magic ETL
 2.9K Visualize
 2.2K Charting
 434 Beast Mode
 22 Variables
 510 Automate
 114 Apps
 388 APIs & Domo Developer
 8 Workflows
 26 Predict
 10 Jupyter Workspaces
 16 R & Python Tiles
 331 Distribute
 76 Domo Everywhere
 255 Scheduled Reports
 65 Manage
 65 Governance & Security
 1 Product Release Questions
 Community Forums
 40 Getting Started
 26 Community Member Introductions
 67 Community Announcements
 4.8K Archive