Calculate correlation coefficient in beastmode? Do not want to use correlation chart.

Hi,

I need a formula in beastmode to calculate the correlation coefficient between two variables?

Thanks,

Paul

IF I SOLVED YOUR PROBLEM, PLEASE "ACCEPT" MY ANSWER AS A SOLUTION. THANK YOU!

Tagged:

Best Answers

  • DavidChurchman
    Answer ✓

    Maybe this is outdated, but I linked a discussion of how CORR() doesn't work in BeastMode, but gives the manual formula (credit to @Dean_Wangerin):

    I was able to test and confirm this BeastMode can roll up correctly by date and by week:

    (count(X) * sum(X * Y) - sum(X) * sum(Y)) / (sqrt(count(X) * sum(X * X) - sum(X) * sum(X)) * sqrt(count(X) * sum(Y * Y) - sum(Y) * sum(Y)))

    Keep in mind that a correlation is finding the best slope (Y/X) of a line that goes through points, so you need at least two points and those points can't be vertical (or average to a vertical slope), or your correlation will be undefined.

    Please 💡/💖/👍/😊 this post if you read it and found it helpful.

    Please accept the answer if it solved your problem.

  • ColemenWilson
    edited January 16 Answer ✓

    I just compared the result of CORR() in a beastmode and it matches what I am getting in Excel/Google Sheets with CORREL(). So it definitely works.

    If I solved your problem, please select "yes" above

Answers

  • There is a CORR() function you can use in a beastmode. Example: CORR(field1,field2)

    If I solved your problem, please select "yes" above

  • pauljames
    pauljames Contributor

    @ColemenWilson , i don't see this beastmode available.

    IF I SOLVED YOUR PROBLEM, PLEASE "ACCEPT" MY ANSWER AS A SOLUTION. THANK YOU!

  • ColemenWilson
    edited January 15

    It works. Give it a try and let me know how it goes! Example: CORR(field1,field2)

    There are many functions that work in beastmode that are not displayed in the functions list in beastmode editor.

    If I solved your problem, please select "yes" above

  • pauljames
    pauljames Contributor

    IF I SOLVED YOUR PROBLEM, PLEASE "ACCEPT" MY ANSWER AS A SOLUTION. THANK YOU!

  • Hmm I didn't realize you wanted to see the correlation coefficient change over time. For that you'd need to use fixed by. Something like this:
    CORR(Field1,Field2) FIXED (BY Date)

    If I solved your problem, please select "yes" above

  • pauljames
    pauljames Contributor

    @ColemenWilson ok, i think we are close. still not working though…we've got dates returning

    IF I SOLVED YOUR PROBLEM, PLEASE "ACCEPT" MY ANSWER AS A SOLUTION. THANK YOU!

  • Hmm I am not getting the same error and I've applied the same settings as you. Can you share your beastmode? Also, if you switch the chart type to a table card and then do no aggregation and use CORR(field1,field2) do you get the value you are expecting?

    If I solved your problem, please select "yes" above

  • pauljames
    pauljames Contributor

    No. I don't.

    IF I SOLVED YOUR PROBLEM, PLEASE "ACCEPT" MY ANSWER AS A SOLUTION. THANK YOU!

  • pauljames
    pauljames Contributor

    IF I SOLVED YOUR PROBLEM, PLEASE "ACCEPT" MY ANSWER AS A SOLUTION. THANK YOU!

  • Are you using numeric values? It won't work with dates, text, or any other dimension - only numbers.

    If I solved your problem, please select "yes" above

  • DavidChurchman
    Answer ✓

    Maybe this is outdated, but I linked a discussion of how CORR() doesn't work in BeastMode, but gives the manual formula (credit to @Dean_Wangerin):

    I was able to test and confirm this BeastMode can roll up correctly by date and by week:

    (count(X) * sum(X * Y) - sum(X) * sum(Y)) / (sqrt(count(X) * sum(X * X) - sum(X) * sum(X)) * sqrt(count(X) * sum(Y * Y) - sum(Y) * sum(Y)))

    Keep in mind that a correlation is finding the best slope (Y/X) of a line that goes through points, so you need at least two points and those points can't be vertical (or average to a vertical slope), or your correlation will be undefined.

    Please 💡/💖/👍/😊 this post if you read it and found it helpful.

    Please accept the answer if it solved your problem.

  • ColemenWilson
    edited January 16 Answer ✓

    I just compared the result of CORR() in a beastmode and it matches what I am getting in Excel/Google Sheets with CORREL(). So it definitely works.

    If I solved your problem, please select "yes" above

  • You're right. I played around and got CORR(`X`, Y) FIXED(BY Date) (with and average aggregation) to work, with some big caveats:

    • If the standard deviation of either X or Y for a given date is 0, then you'll get a null value. So if you have flat values or vertical values on a given date, or only one value at a given date, you'll get a null result at that date, and then if you average that correlation into a week, you'll get a null for the whole week.
    • You could overcome part of that by adjusting your function to CORR(`X`, Y) FIXED(BY week(Date) ), but then your function won't adjust to the "Graph by" [week, month…] option
    • But let's say you have variance on every day of your dataset and you don't add 'week' to your function, so you can still use the 'Graph by'. You should be aware that averaging the results of the correlation is averaging each day's correlation. If you have a day with more data than another, this might not be the true correlation for the week. For example, if you have two values on Monday with a correlation of 0 and 18 values on Friday with a correlation of 1, you'll get an average of 0.5, when the true correlation is 0.9. (Unless you want each day to be weighted exactly equally).

    Here's an example, showing how the CORR(`X`, Y) FIXED(BY Date) produces a null value for a week with a single day with no variance, and a slightly different result than the manual equation for a week with a single day with more value pairs than another day:

    Please 💡/💖/👍/😊 this post if you read it and found it helpful.

    Please accept the answer if it solved your problem.