Normal distribution

SUMMARY: Better known as the bell curve, the normal distribution has many applications.

The normal distribution is one of the most useful and important probability distributions, with a wide range of theoretical and real-world applications. Many people know the normal distribution primarily by its colloquial name, the “bell curve,” which comes from its characteristic shape: a symmetric curve with a pronounced peak in the middle and diminishing tails. Mathematically, normal distributions are a family of continuous probability distributions. The normal density function has no closed-form antiderivative in terms of elementary functions, but areas under the curve, which correspond to probabilities, can be accurately approximated with methods like numerical integration. All normal probability distributions display the same symmetric bell shape, but can have any real-valued mean (μ) and positive real-valued standard deviation (σ). The standard normal distribution is a special case with a mean of zero and standard deviation of one. Any normal distribution can be standardized to the standard normal by subtracting the mean and dividing by the standard deviation; the standard normal is theoretically important and extensively tabulated. Computers and calculators also allow direct calculation of normal probabilities. Students often use both technology and tables when they study the normal distribution in high school and beyond.
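As a minimal sketch of the standardization and numerical integration described above, the following Python snippet computes a normal probability in two ways: by standardizing the endpoints and using the error function from the standard library, and by integrating the density with the trapezoidal rule. The function names and the example values (mean 100, standard deviation 15) are illustrative assumptions, not part of the original discussion.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def standard_normal_cdf(z):
    """P(Z <= z) for the standard normal, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_prob(a, b, mu, sigma):
    """P(a <= X <= b), computed by standardizing both endpoints."""
    return standard_normal_cdf((b - mu) / sigma) - standard_normal_cdf((a - mu) / sigma)

def normal_prob_trapezoid(a, b, mu, sigma, steps=10_000):
    """P(a <= X <= b), approximated with the composite trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h

# Illustrative values: probability of falling within one standard deviation of the mean.
exact = normal_prob(85, 115, mu=100, sigma=15)
approx = normal_prob_trapezoid(85, 115, mu=100, sigma=15)
print(round(exact, 4), round(approx, 4))
```

Both printed values should be about 0.6827, the familiar probability of falling within one standard deviation of the mean.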


Many naturally occurring phenomena are normally or approximately normally distributed, like the heights of adult human beings. In other cases, such as intelligence tests, the measurements are purposely structured or scaled according to this distribution. Several other probability distributions converge to the normal distribution or are well approximated by it. The central limit theorem, which states that suitably standardized sums or averages of many independent random variables are approximately normally distributed, is the foundation for a wide range of commonly used statistical procedures, particularly for estimation and inference. Another common name for the normal distribution is the “Gaussian distribution,” after Carl Friedrich Gauss, whose work significantly advanced many statistical theories and concepts. Occasionally it is referred to as the “Laplace distribution” or “Laplace–Gauss distribution,” after Pierre-Simon Laplace, although “Laplace distribution” more commonly denotes a different, double-exponential distribution. The variety of names for the normal distribution likely reflects the debate on the origins of the term “normal distribution” and the breadth of people who influenced its development.

History

The first appearance of the term “normal distribution” in a published document is often credited to a seminal paper from Karl Pearson in 1895. However, others attribute the first use to Charles Peirce in 1873, to Francis Galton in 1889, or to Henri Poincaré in 1893. Statistician and historian Stephen Stigler believes that it might have been used much earlier, and there is certainly evidence to support that assertion.

Abraham de Moivre is credited with the first mathematical derivation of the normal distribution in his 1733 work Approximatio ad summam terminorum binomii (a + b)^n in seriem expansi. Working with sums of Bernoulli random variables, he used integral calculus to approximate the discrete binomial distribution with a continuous one, which resulted in a bell-shaped continuous distribution. Continuing this idea, Pierre-Simon Laplace presented the central limit theorem in 1778; its special case for the binomial distribution is sometimes called the “de Moivre–Laplace theorem.” The name “central limit theorem” itself is credited to George Pólya’s 1920 work on the normal distribution. Although these early results concerned sums of binary variables, the central limit theorem applies to sums of both discrete and continuous random variables. It has many real-world applications along with its theoretical importance, and it is fundamental to statistical inference.
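The binomial approximation described above can be illustrated numerically. The following Python sketch compares exact binomial probabilities with the normal density that has the same mean and standard deviation. The choice of 100 trials with success probability 0.5, and the function names, are assumptions made for illustration only.

```python
import math

def binomial_pmf(k, n, p):
    """Exact probability of k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Illustrative parameters (hypothetical, chosen only for the demonstration).
n, p = 100, 0.5
mu = n * p                           # mean of the binomial
sigma = math.sqrt(n * p * (1 - p))   # standard deviation of the binomial

# Print the exact probability next to the normal approximation at a few points.
for k in (40, 45, 50, 55, 60):
    print(k, round(binomial_pmf(k, n, p), 5), round(normal_pdf(k, mu, sigma), 5))

# Largest pointwise difference between the two over all possible outcomes.
max_gap = max(abs(binomial_pmf(k, n, p) - normal_pdf(k, mu, sigma))
              for k in range(n + 1))
print(max_gap)
```

With these values the two columns stay close at every point, and the agreement generally improves as the number of trials grows.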

Robert Adrain, an American, and Carl Friedrich Gauss, a German, worked simultaneously on similar notions at the start of the nineteenth century without being aware of each other’s work. In 1808, Adrain presented arguments regarding the validity of the normal distribution for describing distributions of measurement errors, inspired by a real-world problem in surveying. He used this initial work to further develop and prove Adrien-Marie Legendre’s method of least squares. Gauss published his Theory of Celestial Movement in 1809. This work included several critical contributions to mathematics and statistics, including maximum likelihood parameter estimation, the method of least squares, and the normal distribution. The breadth of that work is perhaps part of the reason that Gauss tends to be given credit over Adrain for their similar contributions regarding the normal distribution.

In 1829, Adolphe Quetelet brought the concept of the normal distribution of error terms into the analysis of social data. He wanted to discover the underlying laws of society in the same way other researchers were exploring scientific and mathematical laws. Quetelet invented the term “social physics” and empirically developed the first notions of the measure now called “body mass index.” He analyzed several data sets of human biological and social data, such as the heights and weights of conscripted soldiers, and by inductively using the central limit theorem, he concluded that the normal error distribution described these measures quite well. Galton also contributed to the application and development of the normal distribution in the biological and social sciences. He produced the first known index of correlation as well as regression analysis, and he proved that a normal mixture of normal distributions is itself normal. His colleagues Walter Weldon and Karl Pearson also contributed to normal theory and applications, and the three of them cofounded the journal Biometrika. The field of biometrics is generally traced back to Weldon’s seminal papers. Pearson used the method of moments to estimate mixtures of normal distributions and further developed correlation and regression methods based on the normal distribution. However, part of his motivation for developing methods like chi-square analyses was apparently to try to decrease the growing reliance on the normal distribution as a foundation of statistical theory and analytic methods.

Pearson’s efforts to diminish the role of the normal distribution in statistics failed. Many other mathematicians and statisticians, including Pearson’s son Egon, continued to develop theory and applications in a variety of areas. For example, William Gosset and Ronald Fisher derived and refined the closely related Student’s t distribution in the early twentieth century. The distribution does not bear Gosset’s name because he worked for Guinness Brewery, which for proprietary reasons did not allow him to publish under his own name, so he adopted the pseudonym “Student.” Starting in the 1930s, Samuel Wilks explored many aspects of normal distributions. These included deriving sampling distributions for parameter estimates in bivariate normal distributions as well as for covariances in multivariate normal distributions, which led to important advances in multivariate statistical methods. The American Statistical Association’s Wilks Award is one of the most prestigious in the field of statistics. Miroslaw Romanowski published a generalized theory of modified normal distributions in 1968 that helps characterize errors that are not well described by the normal distribution. Another such generalization is the skew normal. Other related distributions include the “lognormal distribution” or “Galton distribution,” which describes a variable whose logarithm is normally distributed, and the “folded normal,” which is the distribution of the absolute value of a normally distributed variable.
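The definitions of the lognormal and folded normal distributions translate directly into formulas based on the standard normal. The following Python sketch, offered as an illustration under those definitions, expresses both cumulative distribution functions through the error function. The function names and example parameters are hypothetical.

```python
import math

def standard_normal_cdf(z):
    """P(Z <= z) for the standard normal, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lognormal_cdf(y, mu, sigma):
    """P(Y <= y), where ln(Y) is normal with mean mu and standard deviation sigma."""
    if y <= 0:
        return 0.0  # a lognormal variable is always positive
    return standard_normal_cdf((math.log(y) - mu) / sigma)

def folded_normal_cdf(y, mu, sigma):
    """P(|X| <= y), where X is normal with mean mu and standard deviation sigma."""
    if y < 0:
        return 0.0  # an absolute value is never negative
    return standard_normal_cdf((y - mu) / sigma) - standard_normal_cdf((-y - mu) / sigma)

# Illustrative parameters: underlying normal with mean 0 and standard deviation 1.
print(lognormal_cdf(1.0, mu=0.0, sigma=1.0))      # the median of this lognormal is exp(0) = 1
print(folded_normal_cdf(1.0, mu=0.0, sigma=1.0))  # same as P(-1 <= X <= 1)
```

The first value is 0.5, since half of the underlying normal lies below its mean, and the second is about 0.6827.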

The term “bell curve” became even more widely known in 1994 when psychologist Richard Herrnstein and political scientist Charles Murray wrote The Bell Curve, which took its name from the distribution of IQ scores and included a picture of the normal distribution on its front cover. Herrnstein and Murray correlated intelligence scores with social outcomes and asserted that social stratification based on intelligence was on the rise. The book remains highly controversial for the authors’ inclusion of discussions regarding supposed relationships between race and intelligence and has spurred many debates on both social and statistical matters.

Bibliography

DiMaria, Lauren. "What Is a Normative Group in Psychology?" Reviewed by Adah Chung. Verywell Mind, 11 Aug. 2023, www.verywellmind.com/normative-group-1067184. Accessed 27 Sept. 2024.

Heyde, Chris, and Eugene Seneta, eds. Statisticians of the Centuries. New York: Springer-Verlag, 2001.

McLeod, Saul, PhD. "Introduction to the Normal Distribution (Bell Curve)." Reviewed by Olivia Guy-Evans, MSc. Simply Psychology, 11 Oct. 2023, www.simplypsychology.org/normal-distribution.html. Accessed 27 Sept. 2024.

Stigler, Stephen M. The History of Statistics. Cambridge, MA: Harvard University Press, 1986.