In microbiome papers, terms such as dependence, association, and correlation are commonly used. It is important to understand the differences between these terms and to use them properly. Altman and Krzywinski reviewed this topic in a recent publication in Nature Methods. You can read the full paper here.

Below is a summary of the points that I found most important:

Independence: Two variables are independent when the value of one gives no information about the value of the other. 

Association is synonymous with dependence and is different from correlation. Association is a very general relationship: one variable provides information about another. Correlation is more specific: two variables are correlated when they display an increasing or decreasing trend. Thus, correlation is one type of association, characterized by an increasing or decreasing trend and quantified using correlation coefficients.
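To make the distinction concrete, here is a minimal sketch (my own illustration, not taken from the paper) in which Y is completely determined by X, so the two are clearly associated, yet the Pearson coefficient is essentially zero because there is no overall increasing or decreasing trend. The helper `pearson_r` and the toy data are assumptions made for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy data: x is symmetric around zero, and y depends on x exactly.
xs = [x / 10 for x in range(-50, 51)]
ys_square = [x ** 2 for x in xs]   # dependent on x, but no monotonic trend
ys_cube = [x ** 3 for x in xs]     # dependent on x, with an increasing trend

r_square = pearson_r(xs, ys_square)
r_cube = pearson_r(xs, ys_cube)

print(f"y = x^2: r = {r_square:.3f}")
print(f"y = x^3: r = {r_cube:.3f}")
```

Both relationships are associations, but only the second would normally be called a correlation.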

For quantitative and ordinal data, there are two primary measures of correlation: Pearson's correlation (r), which measures linear trends, and Spearman's (rank) correlation (s), which measures increasing and decreasing trends that are not necessarily linear. Like other statistics, these have population values, usually referred to as ρ. There are other measures of association that are also referred to as correlation coefficients, but which might not measure trends.
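The difference between the two coefficients can be sketched with a trend that is increasing but far from linear. In the example below (an illustration of mine, with made-up data), Spearman's coefficient is computed as the Pearson coefficient of the ranks, which is valid when there are no ties. The function names `ranks` and `spearman_s` are hypothetical helpers, not part of any library.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """Rank of each value (1 = smallest); assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman_s(xs, ys):
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Toy data: y grows exponentially with x (monotonic, but not linear).
xs = list(range(10))
ys = [math.exp(x) for x in xs]

r = pearson_r(xs, ys)
s = spearman_s(xs, ys)

print(f"Pearson r  = {r:.3f}")
print(f"Spearman s = {s:.3f}")
```

Because the ordering of the points is perfectly preserved, s reaches its maximum, while r is noticeably lower since the points do not fall on a straight line.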

The paper goes into more detail about r and s and the cautions needed when interpreting them. It is possible to see large correlation coefficients even for random data. Thus, r should be reported together with a P value, which measures the degree to which the data are consistent with the null hypothesis that there is no trend in the population. That being said, because P depends on both r and the sample size, it should never be used as a measure of the strength of the association. It is possible for a smaller r, whose magnitude can be interpreted as the estimated effect size, to be associated with a smaller P merely because of a large sample size. Statistical significance of a correlation coefficient does not imply substantive or biologically relevant significance.
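The dependence of P on sample size can be sketched numerically. The function below is an assumption of mine rather than the paper's method: it approximates the two-sided P value with the Fisher z-transformation (atanh(r) is roughly normal with standard error 1/sqrt(n - 3) for approximately bivariate-normal data), which needs only the standard library. The values of r and n are made up for illustration.

```python
import math

def approx_p_value(r, n):
    """Approximate two-sided P value for the null hypothesis rho = 0.

    Uses the Fisher z-transformation; assumes roughly bivariate-normal
    data and n > 3. This is an approximation, not the exact t-test.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

p_small_sample = approx_p_value(0.5, 10)       # moderate r, few observations
p_large_sample = approx_p_value(0.1, 10_000)   # weak r, many observations

print(f"r = 0.5, n = 10:    P ~ {p_small_sample:.3f}")
print(f"r = 0.1, n = 10000: P ~ {p_large_sample:.1e}")
```

The weaker correlation ends up with the far smaller P value, purely because of the larger sample, which is why P should not be read as a measure of strength.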

The value of both coefficients will fluctuate with different samples as well as with the amount of noise and/or the sample size. With enough noise, the correlation coefficient can cease to be informative about any underlying trend. Thus, when the linear trend is masked by noise, larger samples are needed to confidently measure the correlation.
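A small simulation can illustrate how much r fluctuates from sample to sample. This is a sketch under assumptions I chose myself (normally distributed X, Y = X plus unit-variance noise, a fixed random seed, 200 repeated samples); the helper `spread_of_r` is hypothetical and simply reports the range of r across repeats.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spread_of_r(n, repeats, rng):
    """Range (max - min) of r over repeated samples of size n from y = x + noise."""
    values = []
    for _ in range(repeats):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [x + rng.gauss(0, 1) for x in xs]
        values.append(pearson_r(xs, ys))
    return max(values) - min(values)

rng = random.Random(0)  # fixed seed so the run is reproducible
spread_small = spread_of_r(10, 200, rng)
spread_large = spread_of_r(1000, 200, rng)

print(f"n = 10:   r varies over a range of {spread_small:.2f}")
print(f"n = 1000: r varies over a range of {spread_large:.2f}")
```

The underlying trend is the same in both cases; only the sample size differs, and the small samples give much less stable estimates.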

The Pearson correlation coefficient can also be used to quantify how much fluctuation in one variable can be explained by its correlation with another variable. The authors' earlier discussion of analysis of variance showed that the effect of a factor on the response variable can be described as explaining the variation in the response; the response varied, and once the factor was accounted for, the variation decreased. The squared Pearson correlation coefficient r² has a similar role: it is the proportion of variation in Y explained by X (and vice versa). For example, r = 0.05 means that only 0.25% of the variance of Y is explained by X (and vice versa), and r = 0.9 means that 81% of the variance of Y is explained by X. This interpretation is helpful in assessments of the biological importance of the magnitude of r when it is statistically significant.
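The arithmetic behind these two figures is just squaring r. A tiny sketch (the function name `variance_explained` is my own) makes it easy to check other values:

```python
def variance_explained(r):
    """Proportion of variance in one variable explained by the other (r squared)."""
    return r ** 2

for r in (0.05, 0.9):
    share = variance_explained(r)
    print(f"r = {r}: {share:.2%} of the variance explained")
```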

Association should not be confused with causality; if X causes Y, then the two are associated (dependent). However, associations can arise between variables in the presence (i.e., X causes Y) and absence (i.e., they have a common cause) of a causal relationship. The association merely suggests a hypothesis, such as a common cause, but does not offer proof. In addition, when many variables in complex systems are studied, spurious associations can arise. Thus, association does not imply causation. Because not all associations are correlations, and because causality, as discussed above, can be connected only to association, we cannot equate correlation with causality in either direction.
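The common-cause scenario can be sketched with a simulation. Everything here is an assumption made for illustration: a hidden variable Z drives both X and Y with a coefficient of 1, the noise is normal, and the seed is fixed. X never influences Y, yet the two are clearly correlated; once the known contribution of Z is subtracted, the correlation largely disappears.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(1)  # fixed seed so the run is reproducible
n = 2000

# Z is the common cause; X and Y each depend on Z but not on each other.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = pearson_r(x, y)

# Remove the (known, by construction) contribution of Z from both variables.
x_resid = [xi - zi for xi, zi in zip(x, z)]
y_resid = [yi - zi for yi, zi in zip(y, z)]
r_resid = pearson_r(x_resid, y_resid)

print(f"r(X, Y)               = {r_xy:.2f}")
print(f"r(X, Y) with Z removed = {r_resid:.2f}")
```

In real data the common cause is usually unmeasured or unknown, which is exactly why an observed association can only suggest a hypothesis.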

 
