History of Probability
The history of probability is a rich exploration of humanity's attempts to understand and quantify uncertainty and chance. Early concepts of probability were intertwined with philosophical and religious beliefs, as ancient cultures grappled with the ideas of fate and randomness. The formal study of probability began to take shape in the seventeenth century, spurred by mathematicians such as Blaise Pascal and Pierre de Fermat, who addressed gambling problems and laid the groundwork for modern probability theory. Over time, significant developments arose, including Jakob Bernoulli's Law of Large Numbers and Pierre-Simon Laplace's influential works that expanded probability's applications to areas like demographics and insurance.
Throughout history, probability has evolved into distinct approaches, notably the frequentist perspective, which relies on empirical data, and the subjective Bayesian approach, which incorporates personal belief and prior knowledge. Key contributions from mathematicians like Thomas Bayes and Carl Friedrich Gauss further refined probability concepts, leading to important distributions like the normal distribution. The field has since found applications in diverse areas, from genetics to finance, reflecting its integral role in various scientific and practical disciplines. Despite its long history, the definition and interpretation of probability remain subjects of debate, highlighting the complexity and ongoing intrigue of this foundational concept in mathematics and science.
Summary: Humans have implicitly understood concepts of probability and randomness since antiquity, but these concepts have been more formally studied since the seventeenth century.
Throughout history, humans have used many methods to try to predict the future. Some believed that the future was already laid out for them by a divine power or fate, while others seem to have believed that the future was uncertain. There are still debates on the extent to which people were able to speculate on the future prior to the development of statistics in the seventeenth and eighteenth centuries. Some assert that such speculations were impossible, yet other historical evidence suggests that at least some people must have been able to perceive the world in terms of risks or chances, even if it was not in quite the same way as later mathematicians and statisticians.

The Greek philosopher Aristotle proposed that events could be divided into three groups: deterministic or certain events, chance or probable events, and unknowable events. The idea of “randomness” is often used to indicate completely unknowable events that cannot be predicted. In mathematics, the long-term outcomes of random systems are, in fact, “knowable” or describable using various rules of probability. Probability distributions, expressed as tables, graphs, or functions, show the relationship between all possible outcomes of some experiment or process, like rolling a die, and the chance that those outcomes will happen. For example, lotteries state the chances of winning various prizes, and people seeking medical treatment might be told the odds of success.

Random processes and probability can run counter to human intuition and the way in which human brains perceive and organize information, which is perhaps another reason that quantifying ideas of probability is still an ongoing endeavor. Students are often introduced to probability concepts in the earliest elementary grades, such as basic binary classifications of outcomes as “likely” or “unlikely” and the notion of probabilities as experimental frequencies. More formal axioms of probability may be introduced in the middle grades.
Probability theory and probability-based mathematical statistics are typically studied in college, though they may be included in advanced high school classes. Some elements of probability theory and applications are also taught in other academic disciplines, like business, genetics, and quantum mechanics.
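The idea of a probability distribution as a table pairing each possible outcome with its chance can be made concrete with dice, the example used above. The following is a minimal illustrative sketch (the Python dictionary representation is the author's own device, not from the source); note that while each face of one die is equally likely, the sums of two dice are not:

```python
from fractions import Fraction
from itertools import product

# Distribution for one fair six-sided die: each face has probability 1/6.
one_die = {face: Fraction(1, 6) for face in range(1, 7)}

# Distribution for the sum of two fair dice, built by enumerating
# all 36 equally likely (face, face) pairs.
two_dice = {}
for a, b in product(range(1, 7), repeat=2):
    two_dice[a + b] = two_dice.get(a + b, Fraction(0)) + Fraction(1, 36)

print(one_die[3])             # 1/6
print(two_dice[7])            # 1/6 -- seven is the most likely sum
print(sum(two_dice.values())) # 1 -- probabilities over all outcomes sum to one
```

The check that all probabilities sum to one reflects a basic requirement of any probability distribution.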
![Probability of base call errors: relationship between Q and p using the Sanger (red) and Solexa (black) equations. The vertical dotted line indicates p = 0.05, or equivalently, Q ≈ 13. By Sealox (own work), public domain, via Wikimedia Commons](https://imageserver.ebscohost.com/img/embimages/ers/sp/embedded/94982013-91543.jpg?ephost1=dGJyMNHX8kSepq84xNvgOLCmsE2epq5Srqa4SK6WxWXS)
![Probability tree. By Erzbischof (own work), CC0, via Wikimedia Commons](https://imageserver.ebscohost.com/img/embimages/ers/sp/embedded/94982013-91544.jpg?ephost1=dGJyMNHX8kSepq84xNvgOLCmsE2epq5Srqa4SK6WxWXS)
Early History
Archaeological evidence, such as astragalus bones found at ancient sites, suggests that games of chance have been around for several millennia or longer. Egyptian tomb paintings show astragali being used for games like Hounds and Jackals, much like the way twenty-first-century game players use dice. The ideas of randomness that underlie probability were often closely tied to philosophy and religion. Many ancient cultures embraced the notion of a deterministic fate. The Greek pantheon was among those that included deities associated with determinism, literally known as the Fates. The popular goddess Fortuna in the Roman pantheon suggests a recognition of the role of chance in the world. Jainism is an Indian religion with ancient roots, whose organized form appears to have originated sometime between about the ninth and sixth centuries b.c.e. The Jainist logic system known as syadvada includes concepts related to probability; its Sanskrit root word syat translates variously as “may be” or “is possible.” Probability is also a component of the body of Talmudic scholarship; one example is the notion of casting lots, used in some temple functions. Babylonians had a type of insurance to protect against the risk of loss for sea voyages, called “bottomry,” as did the Romans and Venetians.
Origins of Study in the Seventeenth Century
Given the near omnipresence of games of chance and risk in the ancient world, it seems reasonable to think that there were some efforts to estimate or calculate probabilities, at least on a case-specific basis; for example, those who issued maritime insurance would have assigned some type of monetary values for cost and payoff. There is relatively little evidence of broad mathematical research on probability before about the fifteenth century, though some analyses for specific cases survive. For example, a Latin poem by an unknown author called “De Vetula” describes all the ways that three dice can fall. Mathematician and friar Luca Pacioli wrote Summa de arithmetica, geometria, proportioni et proportionalita in 1494, which contains some discussion of probability. A few other works address dice rolls and related ideas. Historians tend to agree that the systematic mathematical study of probability as it is now known originated in the seventeenth century. At the time, considerable tensions still existed between the philosophies of religion, science, determinism, and randomness. Determinists asserted that the universe was the perfect work of a divine creator, ruled by mathematical functions waiting to be discovered, and that any apparent randomness was due to faults in human perception. Many emerging scientific theories, like the heliocentric model advocated by mathematician and astronomer Nicolaus Copernicus, challenged this view by explicitly exploring and quantifying variation and deviations in observations. Astronomy and other sciences, along with the rise of combinatorial algebra and calculus, would ultimately prove very influential in the development of probability theory. Changes in business practices also challenged notions of risk, requiring new methods by which likelihood and payoffs could be determined.
Harkening back to ancient human activities, however, the most popular story for the origin of probability theory concerns gambling questions posed to mathematician Blaise Pascal by Antoine Gombaud, Chevalier de Méré.
In 1654, the Chevalier de Méré presented two problems. One concerned a game in which a pair of six-sided dice was thrown 24 times, betting that at least one pair of sixes would occur. Méré’s attempts at calculation contradicted the conventional wisdom of the time and purportedly led him to lose a great deal of money. The second problem, now called the Problem of Points or Problem of Stakes, concerned the fair division of a pot of money when a game between equally skilled players is terminated prematurely, where the winner of a completed game would normally take the whole pot. Spurred by de Méré’s queries, Pascal and Pierre de Fermat exchanged a series of letters in which they formulated the fundamental principles of general probability theory.
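De Méré's dice question yields to the complement rule: the chance of at least one double six in 24 throws is one minus the chance of missing on every throw. A minimal sketch of the calculation (the comparison with the older, favorable bet of at least one six in four throws of a single die is added here for context; it is not from the text above):

```python
# At least one double six in 24 throws of a pair of dice.
# Each throw misses a double six with probability 35/36.
p_double_six_24 = 1 - (35 / 36) ** 24
print(f"{p_double_six_24:.4f}")  # ~0.4914 -- slightly worse than even money

# For comparison: at least one six in 4 throws of a single die.
p_six_4 = 1 - (5 / 6) ** 4
print(f"{p_six_4:.4f}")          # ~0.5177 -- slightly better than even money
```

The 24-throw bet loses in the long run even though it superficially resembles the favorable single-die bet, which is consistent with the account of de Méré's losses.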
At the time of its development, Pascal and Fermat’s burgeoning theory was commonly referred to as “the doctrine of chances.” Inspired by their work, mathematician and astronomer Christiaan Huygens published De Ratiociniis in Ludo Aleae in 1657, which discussed probability issues for gambling problems. Jakob (also known as James) Bernoulli extended probability theory beyond gambling into areas like demography, insurance, and meteorology, and he composed an extensive commentary on Huygens’s book. One of his most significant contributions was the Law of Large Numbers for the binomial distribution, which states that observed relative frequencies of events become more stable, approaching the true value, as the number of observations increases. Prior definitions based on gambling games tended to assume that all outcomes were equally likely, which was generally true for games with inherent symmetry, like throwing dice. Bernoulli’s extension allowed for empirical inference of unequal chances in many real-world applications. His results appeared in his treatise Ars Conjectandi, published posthumously in 1713. Influenced by this work, mathematician Abraham de Moivre derived approximations to the binomial probability distribution, including what many consider to be the first occurrence of the normal probability distribution, and his The Doctrine of Chances was the primary probability textbook for many years.
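Bernoulli's Law of Large Numbers can be illustrated by simulation: as the number of trials grows, the observed relative frequency of an event settles toward its true probability. A minimal sketch, assuming a hypothetical event with "true" probability 0.3 (both the probability and the trial counts are chosen purely for illustration):

```python
import random

random.seed(42)   # fixed seed so the illustration is reproducible
p_true = 0.3      # hypothetical "true" probability of success

# As the number of trials grows, the observed relative frequency
# stabilizes near p_true, as Bernoulli's theorem describes.
for n in (100, 10_000, 1_000_000):
    hits = sum(random.random() < p_true for _ in range(n))
    print(n, hits / n)
```

With a million trials the observed frequency typically lies within a fraction of a percent of the true value, which is the empirical inference of unequal chances described above.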
Objective and Subjective Approaches
Historically and philosophically, many people have asserted that to be objective, science must be based on empirical observations rather than subjective opinion. Estimating probabilities through direct observations is usually called the “frequentist approach.” The method of inverse or inductive probability, which allows for subjective input into the estimation of probabilities, is traced back to the posthumously published work of eighteenth-century minister and mathematician Thomas Bayes. Conditional probabilities had already been explored by de Moivre, providing the basis for what is now known as “Bayes’s theorem” (or “Bayes’s rule”). In Bayes’s inductive framework, there is some unknown probability that a binary event occurs. A frequentist would make no assumptions about that probability and would carry out experiments to attempt to determine its true value. Using Bayes’s approach, some probability value can be chosen as a starting point, and experiments then conducted to ascertain the likelihood that the value is in fact the correct one. In later interpretations and applications of the method, the initial value might be chosen according to experience or subjective criteria. His work also produced the Beta probability distribution. Bayes’s writings contained no data or examples, though they were extended and presented by minister Richard Price. At the time, they were less influential than frequentist works, though Bayesian methods have generated much discussion and saw a great resurgence in the latter twentieth century.
Applications
Like Bernoulli, Pierre-Simon Laplace extended probability to many scientific and practical problems, and his probability work led to research in other mathematical areas such as difference equations, generating functions, characteristic functions, asymptotic expansions of integrals, and what are now called “Laplace transforms.” Some call his 1812 book, Théorie Analytique des Probabilités, the single most influential work in the history of probability. The Central Limit Theorem, so named in George Pólya’s 1920 work and sometimes called the de Moivre–Laplace theorem, was critical to the development of statistical methods and partly validated the common practice at the time (still used in the twenty-first century) of calculating averages or arithmetic means of observations to estimate location parameters. Error estimates were usually assumed to follow some symmetric probability distribution, such as the rectangular, quadratic, or double exponential. While these distributions had many useful properties, they were mathematically problematic when it came to deriving the sampling distributions of means for parameter estimation. Laplace’s central limit result, which he proved for both direct and inverse paradigms, rectified the problem for large-sample cases and formed the foundation for large sample theory.
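The large-sample behavior Laplace studied can be seen by simulation: averages of independent draws from a non-normal distribution cluster symmetrically around the true mean, with spread shrinking as the sample size grows. A minimal sketch using the uniform (rectangular) distribution, one of the error models mentioned above (the sample size and trial count are illustrative choices):

```python
import random
import statistics

random.seed(7)  # fixed seed for reproducibility

# Draw many samples of size n from Uniform(0, 1), whose mean is 0.5
# and whose standard deviation is sqrt(1/12), about 0.2887.
n, trials = 100, 5_000
sample_means = [
    statistics.fmean(random.random() for _ in range(n)) for _ in range(trials)
]

# The sample means center on 0.5, and their spread is close to
# 0.2887 / sqrt(n), as large sample theory predicts.
print(round(statistics.fmean(sample_means), 3))
print(round(statistics.stdev(sample_means), 3))
```

A histogram of `sample_means` would show the familiar bell shape even though the underlying draws are uniform, which is the content of the Central Limit Theorem.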
Normal Distribution
The normal distribution is among the most central concepts in probability theory and statistics. Many other probability distributions may be approximated by the normal because they converge to it as the number of trials or the sample size approaches infinity. Some of these include the binomial and Poisson distributions, the latter named for mathematician Siméon-Denis Poisson. The Central Limit Theorem depends on this principle. Mathematician Carl Friedrich Gauss is often credited with “inventing” the normal (or Gaussian) distribution, though others had researched it, and Gauss’s own notes refer to “the elegant theorem first discovered by Laplace.” He can fairly be credited with deriving the parameterization of the distribution, which relied in part on inverse probability. Mathematician Robert Adrain, who was apparently unaware of Gauss’s work, discussed the validity of the normal distribution for describing measurement errors in 1808; his work was inspired by a real-world surveying problem. However, Gauss tends to be credited over Adrain, perhaps because of his many publications and the overall breadth of his mathematical contributions.
The fact that Laplace and Gauss worked on both direct and inverse probability was unusual from some perspectives, given the philosophical divide between frequentist and Bayesian practitioners even at the start of the twenty-first century. Later, both would gravitate toward frequentist approaches for minimum variance estimation, which is seen by some as a criticism of inverse probability. Other mathematicians, such as Poisson and Antoine Cournot, criticized inverse methods, while Robert Ellis and John Venn proposed defining probability as the limit of the relative frequency in an indefinite series of independent trials—essentially, the frequentist approach. The maximum likelihood estimation method proposed by Ronald Fisher in the early twentieth century was interpreted by some as melding aspects of frequentist and inverse methods, though he adamantly denied the notion, saying, “The theory of inverse probability is founded upon an error, and must be wholly rejected.” This may explain the essential absence of inverse or Bayesian probability concepts in the body of early statistical inferential methods, which were heavily influenced by Fisher.
Mathematician and anthropometry pioneer Adolphe Quetelet brought the concept of the normal distribution of error terms into the analysis of social data in the early nineteenth century, while others, like Francis Galton, advanced the use of the normal distribution in biological and social science applications in the latter half of the same century. Many mathematicians, statisticians, scientists, and others have contributed to the development of probability theory, far too many to list exhaustively, though recognized probability distributions are named for many of them, such as Augustin Cauchy, Richard von Mises, Waloddi Weibull, and John Wishart. Pafnuty Chebyshev, considered by many to be a founder of Russian mathematics, proved the important principle of convergence in probability, also called the Weak Law of Large Numbers. Andrei Markov’s work on stochastic processes and Markov chains would lead to a broad range of probabilistic modeling techniques and assist with the resurgence of Bayesian methods in the twentieth century.
Some historians have suggested that one difficulty in developing a comprehensive mathematical theory of probability, despite such a long history and so many broad contributions, was the difficulty of agreeing upon a single definition of probability. For example, noted economist John Maynard Keynes asserted that probabilities were a subjective value, or “degree of rational belief,” between complete truth and falsity. In the first half of the twentieth century, mathematician Andrey Kolmogorov outlined the axiomatic approach that formed the basis for much of subsequent mathematical theory and development. Later, Cox’s theorem, named for physicist Richard Cox, would assert that any measure of belief is isomorphic to a probability measure under certain assumptions; it is used as a justification for subjectivist interpretations of probability theory, such as Bayesian methods.

There are also variations or extensions of probability with many applications. Shannon entropy, named for mathematician and information theorist Claude Shannon and drawn in part from thermodynamics, is used in the lossless compression of data. Martingale stochastic (random) processes, introduced by mathematicians such as Paul Lévy, recall the kinds of betting problems that challenged de Méré and inspired the development of probability theory. Chaos theories, investigated by mathematicians including Kolmogorov and Henri Poincaré, sometimes offer alternative explanations for seemingly probabilistic phenomena. Fuzzy logic, derived from mathematician and computer scientist Lotfi Zadeh’s fuzzy sets, has been referred to as “probability in disguise” by Zadeh himself. He has proposed that theories of probability in the age of computers should move away from the binary logic of “true” and “false” toward more flexible, perceptual degrees of certainty that more closely match human thinking.
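Shannon entropy, mentioned above, quantifies the average information content of a source in bits, and lossless compression cannot, on average, beat this bound. A minimal sketch computing H = −Σ p·log₂(p) for a few illustrative distributions:

```python
import math

def shannon_entropy(probs):
    """Entropy in bits: H = -sum(p * log2(p)) over nonzero probabilities."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))            # fair coin: 1.0 bit
print(shannon_entropy([1.0]))                 # certain outcome: 0.0 bits
print(round(shannon_entropy([0.9, 0.1]), 3))  # biased coin: under 1 bit
```

Lower entropy means more predictable output and hence more compressible data, which is why a heavily biased source compresses better than a fair coin's output.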