correlation coefficient vs CramersV

Hello,

I would like to preface this by stating that my statistical knowledge base isn’t that strong. Onto the topic!

In an attempt to better understand Emblem’s Cramér’s V statistic, I looked it up on Wikipedia. I then contrived a data set of two variables, each populated with random whole numbers from 0 to 10 (round(rand()*10,0) in Excel), about 2,200 rows in all.

First, I calculated the correlation coefficient using Excel’s CORREL function and got -0.03611. This is unsurprising: the numbers were all generated independently at random, so I would expect a correlation coefficient close to 0, which is consistent with independence (and that is how the numbers were actually created).
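For anyone who wants to reproduce this step outside Excel, here is a minimal Python sketch of the same idea. It is only an illustration: the function name pearson_r, the seed, and the use of Python's random module in place of Excel's rand() are my own assumptions, so the number it prints will not match the -0.03611 above.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in xs))
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in ys))
    return cov / (sx * sy)


# Hypothetical stand-in for the Excel sheet: two independent columns of
# whole numbers from 0 to 10, mimicking round(rand()*10, 0).
random.seed(1)  # arbitrary seed, only for repeatability
n_rows = 2200
x = [round(random.random() * 10) for _ in range(n_rows)]
y = [round(random.random() * 10) for _ in range(n_rows)]

r = pearson_r(x, y)
print(r)  # should be close to 0 for independent columns
```

With independently generated columns the printed value should land near zero, with either sign.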

Then I took these same values and calculated the Cramér’s V statistic in R, since I did not know how to calculate it in Excel. The Cramér’s V from R was 0.07405789.
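As a cross-check on the R result, here is a rough Python sketch of Cramér’s V as I understand it from the Wikipedia definition: the square root of the chi-squared statistic divided by n times min(rows - 1, columns - 1). The function name cramers_v, the seed, and the simulated data are my own assumptions, and this version applies no bias correction, so it will not reproduce 0.07405789 exactly.

```python
import math
import random
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length lists of categories."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed count for each (x, y) cell
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    # min(rows - 1, columns - 1); assumes each variable has at least 2 levels.
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Hypothetical stand-in for the Excel sheet: two independent columns of
# whole numbers from 0 to 10, treated here as 11 unordered categories.
random.seed(0)  # arbitrary seed, only for repeatability
n_rows = 2200
x = [round(random.random() * 10) for _ in range(n_rows)]
y = [round(random.random() * 10) for _ in range(n_rows)]

v = cramers_v(x, y)
print(v)  # small but never negative, since V is a square root
```

Note that this treats the values 0 to 10 as labels rather than as numbers, which is one way the statistic differs from the correlation coefficient: it has no sign and ignores ordering.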

I’m not sure how best to interpret these results. I know Cramér’s V is a different measure of the association between two variables, and I can see that both stats are close to zero, which is what I would expect, so this is good. I am a little surprised to see that the magnitude of my Cramér’s V is larger than that of my correlation coefficient, but should I be? Should I really have no expectation of what these stats should be beyond being close to zero or closer to +/- 1?

I like the Cramér’s V stat because it makes an adjustment for the number of observations, similar to the adjustment that adjusted R squared makes, so the stat doesn’t just monotonically increase as the number of observations increases. The flip side is that I don’t think the correlation coefficient has any bias toward more or fewer observations, other than getting closer to the true correlation the more observations you look at.

Does anyone have any advice on how best to interpret the difference between the statistics? Or am I really just overthinking this, and a high Cramér’s V means a stronger relationship, a lower Cramér’s V means a more independent relationship, and that’s all I need to focus on?

Kind Regards,

Gareth Keenan

