Correlation and Dependence

Dependence describes a statistical relationship between two random variables. Two random variables are independent if the realization of one has no influence on the probability of any realization of the other. The standard illustration is a throw of two dice, where the random variables are the numbers of dots showing on each die. The value shown by the first die (1 to 6) has no influence on the value shown by the second (also 1 to 6), so the two variables are independent.
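The dice illustration can be checked by direct enumeration. The sketch below is only an illustration and assumes two fair six-sided dice; the helper names `prob` and `dice_are_independent` are hypothetical. It tests the product rule P(X = a and Y = b) = P(X = a) P(Y = b), which is the usual formal statement of independence, and then shows a dependent pair for contrast.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die) for two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on (first, second)."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

def dice_are_independent():
    """Check P(X=a and Y=b) == P(X=a) * P(Y=b) for every pair (a, b)."""
    for a, b in product(range(1, 7), repeat=2):
        joint = prob(lambda o: o[0] == a and o[1] == b)
        marginal_x = prob(lambda o: o[0] == a)
        marginal_y = prob(lambda o: o[1] == b)
        if joint != marginal_x * marginal_y:
            return False
    return True

print(dice_are_independent())  # True

# By contrast, the first die and the total of both dice are dependent:
# a total of 12 is impossible when the first die shows 1.
p_joint = prob(lambda o: o[0] == 1 and sum(o) == 12)
p_product = prob(lambda o: o[0] == 1) * prob(lambda o: sum(o) == 12)
print(p_joint, p_product)  # 0 1/216
```

Exact fractions are used instead of floating-point numbers so that the equality test is not affected by rounding.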

Correlation is a related idea developed by Francis Galton in the late nineteenth century. Statistical definitions of correlation vary, but each is quite specific in the way it describes the relationship between two variables. The Pearson product-moment correlation coefficient is one common definition. For two random variables X and Y it is defined as:

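\[
\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y}
\]

Here

\[
\operatorname{cov}(X, Y) = E\left[(X - \mu_X)(Y - \mu_Y)\right],
\]

where \mu_X and \mu_Y are the means of X and Y, \sigma_X and \sigma_Y are their standard deviations, and E denotes the expected value.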

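This formulation directly shows the meaning of the Pearson product-moment correlation coefficient. If we have data where large values of X tend to occur together with large values of Y, and small values with small values, then the deviations of X and Y from their means tend to share a sign, their products are mostly positive, and the coefficient is positive. If large values of X tend to occur with small values of Y, the products are mostly negative and so is the coefficient. The coefficient always lies in the range −1 to 1.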
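As a rough illustration of this reading, the sketch below computes the sample analogue of the Pearson coefficient directly from deviations about the means. The function name `pearson` and the small data sets are made up for the example and are not taken from the article.

```python
from math import sqrt

def pearson(xs, ys):
    """Sample Pearson correlation computed from deviations about the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Products of paired deviations: positive when both values lie on the
    # same side of their means, negative when they lie on opposite sides.
    cross = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a sample is constant")
    return cross / sqrt(ss_x * ss_y)

xs = [1, 2, 3, 4, 5]
ys_up = [2, 4, 5, 4, 5]      # large x mostly paired with large y
ys_down = [5, 4, 5, 4, 2]    # large x mostly paired with small y

print(round(pearson(xs, ys_up), 3))    # 0.775
print(round(pearson(xs, ys_down), 3))  # -0.775
```

The factors of 1/n in the covariance and in the two standard deviations cancel, which is why the function can work with plain sums.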

Figure 1(a) shows data that fall exactly on an upward-sloping straight line, giving a correlation of 1. Figure 1(b) shows a lower positive correlation, implying variation around some underlying line. Figure 1(c) shows no correlation. Figure 1(d) shows variation around a line with a negative slope, which gives a negative correlation. The points in Figure 1(d) scatter less around their line than those in Figure 1(b), which corresponds to a correlation coefficient with a larger absolute value.

[Figure 1. Scatter plots (a) through (d) illustrating the four cases described above.]
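The four situations described for Figure 1 can be imitated with simulated data. The sketch below does not use the figure's actual data: the slopes, noise levels, sample size, and random seed are all assumptions chosen for illustration. It uses the covariance-over-standard-deviations form of the coefficient.

```python
import random
from statistics import fmean, pstdev

def correlation(xs, ys):
    """Pearson correlation as covariance divided by both standard deviations."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    cov = fmean([(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)])
    return cov / (pstdev(xs) * pstdev(ys))

random.seed(0)  # assumed seed, only to make the run repeatable
n = 2000
xs = [random.gauss(0, 1) for _ in range(n)]

# (a) exact straight line with positive slope
ys_a = [2 * x + 1 for x in xs]
# (b) positive slope with substantial noise
ys_b = [x + random.gauss(0, 1.5) for x in xs]
# (c) values generated without any reference to x
ys_c = [random.gauss(0, 1) for _ in range(n)]
# (d) negative slope with less noise than (b)
ys_d = [-x + random.gauss(0, 0.5) for x in xs]

r_a = correlation(xs, ys_a)
r_b = correlation(xs, ys_b)
r_c = correlation(xs, ys_c)
r_d = correlation(xs, ys_d)

for label, r in zip("abcd", (r_a, r_b, r_c, r_d)):
    print(f"({label}) correlation = {r:+.2f}")
```

With these settings the theoretical correlations are 1 for (a), about 0.55 for (b), 0 for (c), and about −0.89 for (d); the sample values should land close to those figures, with (d) having the larger absolute value because its noise is smaller relative to the slope.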

Because the Pearson coefficient is built on deviations from the mean and assumes a linear relationship, other correlation coefficients have been proposed to describe the relationship between two variables, and these may be more appropriate in other circumstances. Some of them, such as Spearman’s rank correlation coefficient and Kendall’s tau, are also calibrated to lie in the range −1 to 1.
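To illustrate why a rank-based coefficient can be more appropriate, the sketch below compares the three coefficients on data that increase steadily but not along a straight line. It is a minimal sketch under stated assumptions: the data are made up, Spearman's coefficient is computed as the Pearson coefficient of the ranks, and Kendall's tau is computed in its simplest form (often called tau-a), which makes no correction for ties.

```python
from math import exp, sqrt

def pearson(xs, ys):
    """Sample Pearson correlation from deviations about the means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            score += (s > 0) - (s < 0)  # +1 concordant, -1 discordant, 0 tie
    return score / (n * (n - 1) / 2)

xs = [1, 2, 3, 4, 5, 6]
ys = [exp(x) for x in xs]  # always increasing, but strongly curved

print(round(pearson(xs, ys), 2))  # about 0.85: the relationship is not linear
print(spearman(xs, ys))           # 1.0: the ordering agrees perfectly
print(kendall_tau(xs, ys))        # 1.0: every pair is concordant
```

The Pearson coefficient falls short of 1 because the points do not lie on a line, while both rank-based coefficients reach 1 because they depend only on the ordering of the values.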

Bibliography

Blitzstein, J. K., and J. Hwang. Introduction to Probability. Boca Raton, FL: CRC, 2015.

Pishro-Nik, Hossein. Introduction to Probability, Statistics, and Random Processes. Kappa Research, 2014.

Shang, Du, and Pengjian Shang. "The Dependence Index Based on Martingale Difference Correlation: An Efficient Tool to Distinguish Different Complex Systems." Expert Systems with Applications, vol. 213, 1 May 2023, doi.org/10.1016/j.eswa.2022.119284. Accessed 20 Nov. 2024.