Analysis of variance (ANOVA)

Analysis of variance (ANOVA) is a method for testing the statistical significance of differences in means among three or more groups. The method grew out of British scientist Sir Ronald Aylmer Fisher’s investigations in the 1920s on the effect of fertilizers on crop yield. ANOVA is also sometimes called the F-test in his honor.


Conceptually, the method is simple, but its application can become mathematically complex. There are several types, but one-way ANOVA and two-way ANOVA are among the most common. One-way ANOVA compares statistical means in three or more groups without considering any other factor. Two-way ANOVA is used when the subjects are divided by two factors simultaneously, such as patients grouped by sex and by severity of disease.

Background

In ANOVA, the total variance in the subjects in all the data sets combined is partitioned according to the different sources from which it arises, such as between-group variance and within-group variance (also called the "error sum of squares" or "residual sum of squares"). Between-group variance describes the amount of variation among the different data sets. For example, ANOVA may reveal that 50 percent of the variation in some medical factor in healthy adults is due to genetic differences, 30 percent to differences in age, and the remaining 20 percent to other factors. The residual (in this case, the remaining 20 percent) left after the effects of the factors of interest have been extracted is the within-group variance. The total variance is calculated as the total sum of squares, equal to the within-group sum of squares plus the between-group sum of squares.

ANOVA can be used to test a hypothesis. The null hypothesis states that there is no difference between the group means, while the alternative hypothesis states that there is a difference (that the null hypothesis is false). If there are genuine differences between the groups, then the between-group variance should be much larger than the within-group variance; if the differences are merely due to random chance, the between-group and within-group variances will be close. Thus, the ratio of the between-group variance (numerator) to the within-group variance (denominator) can be used to determine whether the group means differ and therefore whether the null hypothesis should be rejected. This is what the F-test does.
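
As a minimal sketch of these computations, assuming three small hypothetical groups, the one-way F statistic can be built directly from the between-group and within-group sums of squares and cross-checked against a library routine such as scipy.stats.f_oneway:

```python
# One-way ANOVA "by hand" on three hypothetical groups, cross-checked
# against scipy.stats.f_oneway. Illustrative sketch only.
import numpy as np
from scipy import stats

groups = [
    np.array([5.1, 4.9, 5.6, 5.0, 5.3]),   # group A (hypothetical data)
    np.array([5.8, 6.1, 5.7, 6.3, 5.9]),   # group B
    np.array([6.8, 7.0, 6.5, 7.2, 6.9]),   # group C
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()

# Between-group sum of squares: spread of the group means around the grand mean.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group (residual) sum of squares: spread of observations around their own group mean.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
# Total sum of squares equals the sum of the two components.
ss_total = ((all_values - grand_mean) ** 2).sum()
assert np.isclose(ss_total, ss_between + ss_within)

df_between = len(groups) - 1                  # k - 1
df_within = len(all_values) - len(groups)     # N - k

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within
p_value = stats.f.sf(f_stat, df_between, df_within)  # upper tail of the F distribution

print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
print(stats.f_oneway(*groups))                # should agree with the manual result
```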

In performing ANOVA, some form of random sampling is required for the procedure to be valid. The usual ANOVA considers groups on what is called a "nominal basis," that is, without any order or quantitative implications. This means that if the groups are composed of cases with mild, moderate, severe, and critical disease, the usual ANOVA ignores this gradient. Further analysis would be needed to study the effect of this gradient on the outcome.

Overview

Among the requirements for the validity of ANOVA are

  • statistical independence of the observations
  • all groups have the same variance (a condition known as "homoscedasticity")
  • the distribution of means in the different groups is Gaussian (that is, following a normal distribution, or bell curve)
  • for two-way ANOVA, the groups must also have the same sample size

Statistical independence is generally the most important requirement. It can be checked with the Durbin-Watson test, which detects serial correlation in the residuals. Observations made too close together in space or time can violate independence. Serial observations, such as those in a time series or repeated measures, also violate the independence requirement and call for repeated-measures ANOVA.
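
As an illustrative sketch, assuming hypothetical residuals from a fitted model, the Durbin-Watson statistic can be computed with statsmodels; values near 2 are consistent with independence, while values toward 0 or 4 suggest positive or negative serial correlation:

```python
# Durbin-Watson check for serial correlation in residuals (illustrative sketch).
import numpy as np
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(0)
residuals = rng.normal(size=50)   # hypothetical residuals from a fitted ANOVA model

print(durbin_watson(residuals))   # values near 2 suggest no serial correlation
```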

The normality requirement is generally fulfilled, thanks to the central limit theorem, when the sample size in each group is large. According to the central limit theorem, as sample size increases, the distribution of the sample means (or sample sums) approaches a normal distribution. Thus, if the number of subjects in the groups is small, one should check the pattern of distribution of the measurements and of their means in the different groups; it should be Gaussian. If the distribution is very far from Gaussian, or the variances are markedly unequal, another statistical test will be needed for the analysis.
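
A minimal sketch of checking these two assumptions on hypothetical groups, using Levene's test for equal variances and the Shapiro-Wilk test for normality within each group (both available in scipy.stats), might look like this:

```python
# Checking homoscedasticity and normality before ANOVA (illustrative sketch).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
groups = [rng.normal(loc=m, scale=1.0, size=30) for m in (5.0, 5.5, 6.0)]  # hypothetical data

# Levene's test: the null hypothesis is that all groups have equal variance.
print("Levene:", stats.levene(*groups))

# Shapiro-Wilk test on each group: the null hypothesis is that the data are normal.
for i, g in enumerate(groups):
    print(f"Shapiro group {i}:", stats.shapiro(g))
```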

ANOVA is based on means, and any means-based procedure is severely perturbed by outliers. Thus, before using ANOVA, the data should be checked for outliers. If any are present, perform a sensitivity test: run the analysis with and without the outliers and examine whether the conclusion changes.
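
A sketch of such a sensitivity check, on hypothetical data containing one suspect value, is to run the test with and without that value and compare the conclusions:

```python
# Sensitivity check: rerun the ANOVA with and without a suspected outlier
# and see whether the conclusion changes (illustrative sketch).
import numpy as np
from scipy import stats

group_a = np.array([5.1, 4.9, 5.6, 5.0, 19.0])   # 19.0 is a suspected outlier (hypothetical)
group_b = np.array([5.8, 6.1, 5.7, 6.3, 5.9])
group_c = np.array([6.8, 7.0, 6.5, 7.2, 6.9])

print("with outlier:   ", stats.f_oneway(group_a, group_b, group_c))
print("without outlier:", stats.f_oneway(group_a[:-1], group_b, group_c))
```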

The results of ANOVA are presented in an ANOVA table. This contains the sums of squares, their respective degrees of freedom (df; the number of data points in a sample that can vary when estimating a parameter), the respective mean squares, and the values of F with their statistical significance, given as p-values. Each mean square is obtained by dividing a sum of squares by its df, and each F value is obtained by dividing a factor's mean square by the within-group mean square. The p-value comes from the F distribution under the null hypothesis. Such a table can be produced by any standard statistical software.
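
As a sketch of producing such a table with statsmodels (the data and the column names value and group are hypothetical), one might write:

```python
# Producing an ANOVA table (sum of squares, df, F, p-value) with statsmodels.
# Illustrative sketch; "value" and "group" are hypothetical column names.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.DataFrame({
    "value": [5.1, 4.9, 5.6, 5.0, 5.8, 6.1, 5.7, 6.3, 6.8, 7.0, 6.5, 7.2],
    "group": ["A"] * 4 + ["B"] * 4 + ["C"] * 4,
})

model = ols("value ~ C(group)", data=df).fit()
table = sm.stats.anova_lm(model)   # sums of squares, df, F, and p-value per source of variation
print(table)
```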

A problem in comparing three or more groups by the F criterion is that a statistically significant result indicates only that a difference exists. It does not tell exactly which group or groups differ. Further analysis, called "multiple comparisons," is required to identify the groups whose means differ.
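
One widely used multiple-comparison procedure (though not the only option) is Tukey's honestly significant difference test; a sketch with statsmodels on hypothetical data might look like this:

```python
# Tukey's HSD multiple-comparison test after a significant ANOVA (illustrative sketch).
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

values = np.array([5.1, 4.9, 5.6, 5.0, 5.8, 6.1, 5.7, 6.3, 6.8, 7.0, 6.5, 7.2])
groups = np.array(["A"] * 4 + ["B"] * 4 + ["C"] * 4)

result = pairwise_tukeyhsd(values, groups, alpha=0.05)
print(result.summary())   # one row per pair of groups, with adjusted confidence intervals
```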

When no statistically significant difference is found across the groups (the null hypothesis is not rejected), there is a tendency to search for a group or even a subgroup that appears to show an effect. Such post-hoc analysis is permissible so long as it is treated as exploratory. To confirm the importance of any such finding, a new study should be conducted on that group or subgroup.
