Standard deviation

Standard deviation is a statistical measure that quantifies the dispersion or spread of a data set relative to its mean (average). It indicates how much individual data points typically deviate from the mean; a smaller standard deviation signifies that the data points are closely clustered around the mean, while a larger standard deviation indicates a wider spread. There are two primary types of standard deviation: population standard deviation and sample standard deviation. The population standard deviation is used when the data set encompasses all members of a given population, whereas the sample standard deviation is applied when the data set is a subset meant to represent a larger group, because it gives a better estimate of that larger group's spread.

To calculate standard deviation, one first computes the variance, which involves determining the mean, calculating the squared differences from the mean, and averaging those squared differences (for a sample, the sum is divided by one less than the number of values). The square root of the variance gives the standard deviation. Neither calculation requires the data to be normally distributed, but when the data do follow a normal distribution, the empirical rule can be applied: approximately 68, 95, and 99.7 percent of data points fall within one, two, and three standard deviations of the mean, respectively. Understanding standard deviation is crucial for interpreting statistical data and assessing variability in various fields, including finance, psychology, and social sciences.

Published in: 2023

By: Tantawi, Randa, PhD

Subject Terms

Standard deviation

Standard deviation is a mathematical value that shows how far the data points in a set typically deviate from the average, or mean, of that set. A small standard deviation means the data points are clustered fairly close to the mean, while a larger value indicates that they are more spread out. The two main types of standard deviation are population standard deviation and sample standard deviation; which calculation to use depends on whether the data set being analyzed is complete or merely a representative sample of a larger set.

Overview

The standard deviation of a data set is the square root of the variance, which describes how far the data points are spread out from the mean. While the variance and the standard deviation are closely related concepts, standard deviation is more useful in a real-world context, as it is expressed in the same units as the original data points, while the variance is expressed in those units squared. Because the standard deviation is defined in terms of the variance, one must first determine the variance in order to determine the standard deviation of a data set.

The first step in calculating the variance is to calculate the mean, which is done by adding all the values in the set together and then dividing the sum by the number of values in the set. Then subtract the mean from each individual value in the set and square each resulting difference. Squaring makes every difference non-negative, so that values above the mean and values below it do not cancel each other out when they are added together. Finally, calculate the mean of the squared differences by adding them all together and once again dividing by the number of values in the set. The number that results is the variance of the set. In order to determine the standard deviation, simply take the square root of the variance.
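The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation; the function names and the eight-value data set are assumptions chosen so that the arithmetic comes out evenly.

```python
import math


def population_variance(values):
    """Variance of a complete data set: the mean of the squared differences."""
    n = len(values)
    mean = sum(values) / n                               # step 1: the mean
    squared_diffs = [(x - mean) ** 2 for x in values]    # step 2: squared differences
    return sum(squared_diffs) / n                        # step 3: average them


def population_std(values):
    """Population standard deviation: the square root of the variance."""
    return math.sqrt(population_variance(values))


# Illustrative data (an assumption, not taken from the article).
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(data))  # 4.0
print(population_std(data))       # 2.0
```

For this data set the mean is 5, the squared differences sum to 32, and dividing by the 8 values gives a variance of 4, so the standard deviation is 2, in the same units as the original values.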

The above calculation is one of two basic formulas for standard deviation. It is often called the “population standard deviation” in order to differentiate it from the sample standard deviation. The population standard deviation is appropriate when the data points in the set represent the entirety of the data being analyzed. However, sometimes the data set is merely a sample of a larger population, and the results will be used to generalize about that larger population, such as when a fraction of a nation’s residents is polled on a political issue and the results are extrapolated to represent the political attitudes of the entire nation. In these cases, applying the population formula to the sample generally produces a value that is too low, so the sample standard deviation should be used instead. Though the sample standard deviation still does not produce an entirely unbiased estimate, it is significantly more accurate.
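The underestimation described above can be observed in a small simulation. The sketch below is only an illustration under stated assumptions: the "population" is the integers 1 through 100, samples of five values are drawn with replacement, and all names are hypothetical. It works with the variance, which underlies the standard deviation.

```python
import random


def population_formula_variance(values):
    """Variance computed with the divide-by-n (population) formula."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical complete population: the integers 1 through 100.
population = list(range(1, 101))
true_variance = population_formula_variance(population)

sample_size = 5
trials = 20_000
total = 0.0
for _ in range(trials):
    # Draw a small sample (with replacement) and apply the population formula to it.
    sample = rng.choices(population, k=sample_size)
    total += population_formula_variance(sample)

average_estimate = total / trials
ratio = average_estimate / true_variance

print(f"true population variance:      {true_variance:.2f}")
print(f"average estimate from samples: {average_estimate:.2f}")
print(f"ratio (estimate / true):       {ratio:.3f}")
```

With samples of five independent draws, the ratio should land near 0.8, meaning the population formula applied to small samples comes out too low on average.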

In order to calculate the sample standard deviation, one must recalculate the variance to produce a sample variance. The calculation is the same except for the last step. Instead of dividing the sum of the squared differences by the number of values in the set, divide the sum by the number of values in the set minus one. Dividing by a slightly smaller number yields a slightly larger result, which corrects for the tendency of the population formula to produce a value that is too low when it is applied to a sample. Then, as before, simply take the square root of the sample variance to produce the sample standard deviation.
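A minimal Python sketch of the sample calculation follows. The function names and data are illustrative assumptions; the comparison uses `statistics.stdev` and `statistics.pstdev` from Python's standard library, which compute the sample and population standard deviations respectively.

```python
import math
import statistics


def sample_variance(values):
    """Sample variance: sum of squared differences divided by (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed for a sample variance")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_std(values):
    """Sample standard deviation: the square root of the sample variance."""
    return math.sqrt(sample_variance(values))


# Illustrative data (an assumption), treated here as a sample of a larger group.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(sample_std(data))          # hand-rolled sample standard deviation
print(statistics.stdev(data))    # library sample standard deviation
print(statistics.pstdev(data))   # library population standard deviation
```

For these eight values the squared differences sum to 32, so the sample variance is 32/7 (about 4.57) rather than 32/8 = 4, and the sample standard deviation comes out slightly larger than the population value of 2.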

It is important to note that neither the population standard deviation nor the sample standard deviation requires a normal distribution of data in order to be calculated; both formulas can be applied to any set of numerical values. However, the standard deviation is easiest to interpret when the data follow a normal distribution, represented by a bell curve. When the distribution deviates markedly from this pattern, for example when it is strongly skewed, the standard deviation alone may give a misleading picture of the spread, and additional measures may be necessary. When the distribution does follow a bell curve, the standard deviation can communicate a great deal about the values within the data set. For example, the empirical rule, also known as the “68-95-99.7 rule,” states that in a normal distribution, approximately 68 percent of the data points in the set fall within one standard deviation of the mean, approximately 95 percent fall within two standard deviations of the mean, and approximately 99.7 percent fall within three standard deviations of the mean. Data points are commonly treated as unusual, or as potential outliers, only when they lie more than two or three standard deviations from the mean, though the exact threshold varies by field and purpose.
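The empirical rule can be checked approximately with simulated data. The sketch below is an illustration under stated assumptions: it draws 50,000 values from a normal distribution with an arbitrarily chosen mean of 100 and standard deviation of 15, and the helper name `fraction_within` is hypothetical.

```python
import math
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Simulated, approximately normal data (the parameters are arbitrary assumptions).
data = [rng.gauss(100, 15) for _ in range(50_000)]

n = len(data)
mean = sum(data) / n
std = math.sqrt(sum((x - mean) ** 2 for x in data) / n)


def fraction_within(values, center, spread, k):
    """Fraction of values lying within k standard deviations of the center."""
    inside = sum(1 for x in values if abs(x - center) <= k * spread)
    return inside / len(values)


fractions = {k: fraction_within(data, mean, std, k) for k in (1, 2, 3)}
for k, frac in fractions.items():
    print(f"within {k} standard deviation(s): {frac:.3f}")
```

The printed fractions should come out close to 0.68, 0.95, and 0.997, with small differences due to random sampling. Repeating the check on strongly skewed data would generally give different proportions, which is why the rule is stated only for normal distributions.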

