# Calculate correlation coefficient in beastmode? Do not want to use correlation chart.

Contributor

Hi,

I need a formula in beastmode to calculate the correlation coefficient between two variables?

Thanks,

Paul

Tagged:

• Coach

Maybe this is outdated, but I linked a discussion of how CORR() doesn't work in BeastMode, but gives the manual formula (credit to @Dean_Wangerin):

I was able to test and confirm this BeastMode can roll up correctly by date and by week:

`(count(X) * sum(X * Y) - sum(X) * sum(Y)) / (sqrt(count(X) * sum(X * X) - sum(X) * sum(X)) * sqrt(count(X) * sum(Y * Y) - sum(Y) * sum(Y)))`

Keep in mind that a correlation is finding the best slope (Y/X) of a line that goes through points, so you need at least two points and those points can't be vertical (or average to a vertical slope), or your correlation will be undefined.

• Coach

I just compared the result of CORR() in a beastmode and it matches what I am getting in Excel/Google Sheets with CORREL(). So it definitely works.

• Coach

There is a CORR() function you can use in a beastmode. Example: CORR(field1,field2)

• Contributor

@ColemenWilson , i don't see this beastmode available.

• Coach
edited January 15

It works. Give it a try and let me know how it goes! Example: CORR(field1,field2)

There are many functions that work in beastmode that are not displayed in the functions list in beastmode editor.

• Contributor

• Coach

Hmm I didn't realize you wanted to see the correlation coefficient change over time. For that you'd need to use fixed by. Something like this:
CORR(`Field1`,`Field2`) FIXED (BY `Date`)

• Contributor

@ColemenWilson ok, i think we are close. still not working though…we've got dates returning

• Coach

Hmm I am not getting the same error and I've applied the same settings as you. Can you share your beastmode? Also, if you switch the chart type to a table card and then do no aggregation and use CORR(field1,field2) do you get the value you are expecting?

• Contributor

No. I don't.

• Contributor

• Coach

Are you using numeric values? It won't work with dates, text, or any other dimension - only numbers.

• Coach

Maybe this is outdated, but I linked a discussion of how CORR() doesn't work in BeastMode, but gives the manual formula (credit to @Dean_Wangerin):

I was able to test and confirm this BeastMode can roll up correctly by date and by week:

`(count(X) * sum(X * Y) - sum(X) * sum(Y)) / (sqrt(count(X) * sum(X * X) - sum(X) * sum(X)) * sqrt(count(X) * sum(Y * Y) - sum(Y) * sum(Y)))`

Keep in mind that a correlation is finding the best slope (Y/X) of a line that goes through points, so you need at least two points and those points can't be vertical (or average to a vertical slope), or your correlation will be undefined.

• Coach

I just compared the result of CORR() in a beastmode and it matches what I am getting in Excel/Google Sheets with CORREL(). So it definitely works.

• Coach

You're right. I played around and got CORR(`X`, `Y`) FIXED(BY `Date`) (with and average aggregation) to work, with some big caveats:

• If the standard deviation of either X or Y for a given date is 0, then you'll get a null value. So if you have flat values or vertical values on a given date, or only one value at a given date, you'll get a null result at that date, and then if you average that correlation into a week, you'll get a null for the whole week.
• You could overcome part of that by adjusting your function to CORR(`X`, `Y`) FIXED(BY week(`Date`) ), but then your function won't adjust to the "Graph by" [week, month…] option
• But let's say you have variance on every day of your dataset and you don't add 'week' to your function, so you can still use the 'Graph by'. You should be aware that averaging the results of the correlation is averaging each day's correlation. If you have a day with more data than another, this might not be the true correlation for the week. For example, if you have two values on Monday with a correlation of 0 and 18 values on Friday with a correlation of 1, you'll get an average of 0.5, when the true correlation is 0.9. (Unless you want each day to be weighted exactly equally).

Here's an example, showing how the CORR(`X`, `Y`) FIXED(BY `Date`) produces a null value for a week with a single day with no variance, and a slightly different result than the manual equation for a week with a single day with more value pairs than another day: