Error Analysis

Type of physical science: Mathematical methods

Field of study: Probability and statistics

Error analysis is the quantitative study of the uncertainty and discrepancies arising in measurement, estimation, and numerical computation. Errors are most widely, though not exclusively, understood as random or stochastic errors, which can be modeled by a number of probability distributions.

Overview

The measured magnitudes of most physical quantities--such as time, length, velocity, temperature, and mass--are always to some extent inaccurate, and only rarely, even in principle, can absolutely true and correct values be determined. It is usually justified to assume that an accurate true value exists, even if it is approachable only in the limit of many repeated measurements. The task of error analysis is to estimate the extent and rate at which the true value is approached by the measured value, together with the measured value's uncertainties. "Error" is the quantity by which a measurement or approximate calculation differs from the most accurate possible determination. It is important to distinguish true observational errors from simple mistakes, such as taking readings before stable equilibrium conditions have been achieved in a sample, or misreading or misrecording a correctly registered result. All errors that are not the result of human oversight are the subject of error analysis.

Differences between an observed quantity and the true value are called "observational" errors. Observational errors have many causes and are classified into "systematic" and "random" errors. Systematic errors have a specific, identifiable cause, the magnitude and arithmetic sign of which typically do not vary from measurement to measurement. An example of a systematic error would be repeatedly taking readings from an improperly calibrated measuring instrument. Many scientists also distinguish instrumental from natural systematic errors. Instrumental errors are those resulting from imperfections in the construction, adjustment, and use of a measuring instrument, such as distorted scales or defective optics. Natural systematic errors are those arising from unaccounted-for changes in the surrounding ambient conditions, such as temperature and electric field strength. Numerous episodes in the history of the physical and natural sciences and engineering, such as measurements of fundamental physical constants and regional geophysical field data, have shown that systematic errors are the most serious and difficult-to-trace forms of experimental inaccuracy. Although some systematic errors can be detected by comparing measurements made by different but equivalent techniques or observers, there is no general method to identify or prevent systematic error.

Random error is usually caused by transient fluctuations and perturbations in the measurement procedure and/or in the data values themselves, the causes of which are usually large in number and interact in complex and irregular ways. These errors are called "random" because their arithmetic sign and value usually change from measurement to measurement by a process indistinguishable from pure chance. Common examples of random errors are unwanted deviations in measurements resulting from noiselike processes, such as the Brownian motion of dust particles jostled by collisions with air molecules. Provided that measurements are sufficiently precise and numerous, most random errors are revealed by standing out from the average trend or most frequent data behavior. In many cases, the random errors for a particular measurement would ideally disappear (average out) in the limit of a very large or infinite number of measurements made under identical conditions. "Precision" refers to the exactness with which given measurements or computations are made, regardless of whether they measure the correct value. The degree to which true values are approached in measurement or computation is designated "accuracy."

If a number of measurements have been made of a specific quantity having some true value, the "arithmetic mean" X of the measurements is the sum of the measurements divided by their total number. If the error associated with each measurement is truly random, then in general the larger the number of measurements, the smaller the net or average error. This is the underlying reason why the arithmetic mean is often considered the statistically best estimate of the true value. The difference between a measured value x and the arithmetic mean X of such measurements is called the residual error.
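These definitions can be made concrete in a short Python sketch; the measurement values and variable names below are purely illustrative and are not drawn from any data set discussed in the text.

    # Hypothetical repeated measurements of a single quantity (arbitrary units).
    measurements = [10.3, 9.8, 10.1, 10.4, 9.9, 10.0]

    # Arithmetic mean X: the sum of the measurements divided by their number.
    mean = sum(measurements) / len(measurements)

    # Residual errors: the difference between each measured value x and the mean X.
    residuals = [x - mean for x in measurements]

    print("mean =", round(mean, 3))
    print("residuals =", [round(r, 3) for r in residuals])
    # The residuals sum to zero (up to round-off), one reason the arithmetic
    # mean serves as the best single estimate of the true value.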

In many composite calculations in physics and engineering, in addition to knowing the error in a single measurement, it is necessary to know how this error propagates into estimates of other measured or computed quantities. The errors of sums, products, and quotients can usually be expressed simply. For more complicated mathematical expressions involving many variables, each with an associated error, it is necessary to employ the calculus of errors. If x is the originally measured quantity and y is the quantity calculated from x, with Δx the error in x, the associated error Δy in y is governed by the derivative, or rate of change, of y with respect to x, defined as dy/dx = limit of (Δy/Δx) as Δx approaches zero. In many cases, if Δx is sufficiently small compared to x, the first-order approximation for Δy is (dy/dx)Δx. If some parameter P(x,y,z) is a function of several measured quantities x, y, and z, the error ΔP in P caused by the errors Δx, Δy, and Δz is given by the differential calculus of errors as ΔP = (∂P/∂x)Δx + (∂P/∂y)Δy + (∂P/∂z)Δz, where ∂P/∂x denotes the partial derivative of P with respect to x. Because each term comprises only the error caused by a single independent quantity, this general formula is frequently designated the superposition principle of (independent) errors. A further conclusion of the error calculus is that independent random errors, expressed as standard deviations σ, add quadratically, so that σtotal² = σx² + σy² + σz².
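As a sketch of how these rules might be applied, the following Python fragment propagates assumed errors through an example composite quantity P(x, y, z) = xy/z, first by the first-order superposition formula and then by quadratic addition of independent standard deviations. The function and all numerical values are illustrative assumptions, not taken from the text.

    import math

    # Illustrative measured values and their assumed errors (standard deviations).
    x, dx = 2.0, 0.05
    y, dy = 3.0, 0.04
    z, dz = 1.5, 0.02

    # Example composite quantity P(x, y, z) = x * y / z.
    P = x * y / z

    # Partial derivatives of P with respect to each measured quantity.
    dP_dx = y / z
    dP_dy = x / z
    dP_dz = -x * y / z**2

    # First-order superposition of the individual error contributions.
    dP_linear = abs(dP_dx) * dx + abs(dP_dy) * dy + abs(dP_dz) * dz

    # Quadratic (in quadrature) addition for independent random errors.
    dP_quad = math.sqrt((dP_dx * dx)**2 + (dP_dy * dy)**2 + (dP_dz * dz)**2)

    print(f"P = {P:.3f}, superposed error = {dP_linear:.3f}, quadrature error = {dP_quad:.3f}")

The quadrature sum is never larger than the direct superposition of absolute contributions, reflecting the partial cancellation of independent random errors.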

For repeated readings of a particular measurement, the net random error in the arithmetic mean of several readings is usually much less than the random error in any individual reading. It is thus necessary to estimate the intrinsic degree of dispersion of the errors in individual measurements and in groups of measurements. The simplest estimate is the range, the difference between the minimum and maximum values of the variable. More frequently used is the mean of the squared residuals, known as the variance, defined mathematically as v = (1/n) Σ (x - X)², where the Greek capital letter Σ denotes the sum over all terms from 1 to n. In many applications, the square root of the variance, known as the root-mean-square deviation, is employed to measure how, on average, a group of n observations deviates from its own mean. When the root-mean-square deviation refers to a very large number of observations, this limiting estimate is known as the standard deviation, denoted by the lower-case Greek letter sigma (σ). Because in general it cannot be assumed that the arithmetic mean of a finite sample equals the true mean of an infinite set of measurements, formulas exist for practical estimates of the standard deviation, using the finite-sample or "working" mean. From these formulas it can be shown that the sum of the squares of the deviations of any given data set is least when the deviations are taken about the data's own working mean.
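The dispersion measures described above can be sketched in a few lines of Python. The readings used here are hypothetical, and the division by n - 1 in the final estimate is one common practical formula (an assumption of this sketch) that compensates for the working mean being computed from the same finite sample.

    import math

    # Hypothetical repeated readings of the same quantity.
    data = [10.3, 9.8, 10.1, 10.4, 9.9, 10.0]
    n = len(data)
    mean = sum(data) / n                     # working (sample) mean

    # Range: the difference between the largest and smallest readings.
    data_range = max(data) - min(data)

    # Variance about the working mean: (1/n) times the sum of squared deviations.
    variance = sum((x - mean)**2 for x in data) / n

    # Root-mean-square deviation about the working mean.
    rms_dev = math.sqrt(variance)

    # Practical estimate of the standard deviation, dividing by n - 1 rather
    # than n because the working mean is estimated from the same sample.
    sample_std = math.sqrt(sum((x - mean)**2 for x in data) / (n - 1))

    print(f"range = {data_range:.2f}, variance = {variance:.4f}")
    print(f"rms deviation = {rms_dev:.4f}, sample standard deviation = {sample_std:.4f}")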

Further relations in the theory of errors arise from efforts to estimate error measures in the limit as the total number of measurements approaches infinity. These extensions require the theory of probability and other mathematics. The notion of probability is illustrated by the simple example that, although it is legitimate to say that the probability of obtaining heads on two consecutive coin tosses is one in four, this does not guarantee that in only four such trials this result would definitely be obtained, but rather that in the long run, in the course of a very large number of such trials, this ratio would be approached. Thus, if ten repeat measurements are made with an on/off circuit recording whether a radioactive decay event has or has not occurred, the probability of getting n "ons" and (10 - n) "offs" will be some fraction for each value of n, an integer between 0 and 10. This fraction, a function f(n) of n, is known as the probability distribution, and the value of f(n) associated with a specific outcome n is called the probability, or probability density, of n. These probabilities are subject to the normalization rule that the sum of all probabilities must be unity.
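A short Python sketch can make the normalization rule concrete. The decay-counting example is modeled here with a binomial distribution, assuming, purely for illustration, that each of the ten readings is "on" with probability 0.5.

    from math import comb

    # Probability distribution f(n) for the number of "on" readings in 10 trials,
    # assuming each reading is "on" with probability p = 0.5 (a binomial model).
    p, N = 0.5, 10
    f = [comb(N, n) * p**n * (1 - p)**(N - n) for n in range(N + 1)]

    for n, prob in enumerate(f):
        print(f"f({n:2d}) = {prob:.4f}")

    # Normalization rule: the probabilities over all possible n must sum to unity.
    print("sum of probabilities =", round(sum(f), 10))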

Applications

Probability representations for the number, kind, degree, and specific behavior of errors help to specify the reliability of quantitative measurements in most areas of the physical, engineering, and natural sciences. Many methods are used to display different features of measured data, notably their degree of randomness and associated error behavior. Where a series of rectangles is constructed, each having a width equal to the size increment of a specific class of measurements (for example, pebbles of sizes between 0.2 and 0.3 centimeter in diameter) and an area equal to the frequency, or total number of members in that class, the resulting plot is called a histogram. Three of the many possible equations are most commonly used to fit actual data histograms: the Gaussian or normal, the Poisson, and the binomial probability distributions. The equation for the normal ("bell-shaped") Gaussian error curve is y = A exp[-h²(x - X)²], where exp is the exponential function, X is the arithmetic mean, and x is the value of any single data observation. A is a constant representing the maximum value of the curve (occurring at x = X), and h is a parameter governing the curve's width or narrowness, inversely proportional to the square of the standard deviation (formally, h² = 1/(2σ²)). A Gaussian histogram indicates that the probability that the true value of an observation lies between x and x + Δx is equal to the fractional area under that portion of the Gaussian bell curve. Evaluating this definite integral can be accomplished by using Gaussian error, or "erf," functions tabulated in many physical science and mathematics reference handbooks.

Common applications of the Gaussian distribution are in computing standard errors and in expressing confidence limits. Scientific measurements are commonly reported in the form x +/- Δx, where x is the measured value and Δx the error estimate. Δx is often taken as the standard deviation, twice the standard deviation, or even σ/(√n). If, for example, in 100 +/- 10, the 10 represents one standard deviation σ, this implies that 68.27 percent of repeated measurements of the same specimen should lie within 1σ of 100, that is, in the range 90 to 110, while 95.45 percent should lie within 2σ, that is, between 80 and 120, always assuming that the underlying probability distribution is in fact Gaussian. The "+/-" factor is known as the confidence interval, expressing the level of statistical confidence that the true value actually lies within this interval.
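The erf values tabulated in handbooks are also available directly in Python's standard library, so the 68.27 and 95.45 percent figures can be reproduced in a brief sketch; the mean of 100 and standard deviation of 10 simply mirror the example above, and the function name is illustrative.

    import math

    def gaussian_fraction(k):
        """Fraction of a Gaussian distribution lying within k standard
        deviations of the mean, computed with the error function erf."""
        return math.erf(k / math.sqrt(2))

    mean, sigma = 100.0, 10.0   # the 100 +/- 10 example from the text

    for k in (1, 2, 3):
        frac = gaussian_fraction(k)
        low, high = mean - k * sigma, mean + k * sigma
        print(f"{frac * 100:6.2f} percent expected between {low:.0f} and {high:.0f}")
    # Prints approximately 68.27, 95.45, and 99.73 percent.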

Wherever it can be assumed that random errors are distributed normally about a mean value X, the requirement that the sum of the squares of these errors, Σ (x - X)², be a minimum is termed the least squares error criterion. Least squares is best known through finding the best fit, or regression, between two variables x and y that have a (usually assumed linear) functional relationship. Many scientists and engineers, as well as students, are familiar with "eyeballing" trends on x-y plots. Regression, however, does the fitting objectively and tests the statistical significance of the results. The latter is an important point, for if two variables (for example, the number of storks and the number of reported childbirths in Brussels, Belgium) are in reality unconnected, an apparent linear fit or regression is meaningless or even deceptive. Classical regression examines the variation of one dependent (output, response) variable Y, subject to error, with another independent (input, explanatory) variable X, in most cases assumed not to be subject to error. Valid regression examples would include the regression of the specific gravity of an igneous mineral on its lead content, or of the height of a child on his or her age. The traditional technique for estimating a regression line involves minimizing the sum of the squares of the distances of the data points from the line. For example, to determine statistically the best values of m and c in the linear governing equation y = mx + c, the optimum values are found by taking the partial derivatives of the sum of squared residuals, Σ (y - mx - c)², with respect to m and c and setting these derivatives equal to zero. It can be shown that, under these assumptions, errors about the regression line are normally (Gaussian) distributed with zero mean and variance σ². Furthermore, it is important to determine whether the differences between the observed and predicted values of Y (the residuals), that is, the spread of Y values about the regression line, increase, decrease, or otherwise vary with varying values of X. In some situations, there is no clear "dependent" or "independent" variable, and so-called structural regression is required when both variables (X and Y) are subject to error. The predominance of error in X, in Y, or in both can be checked by successively regressing X against Y and then Y against X and comparing the slope and intercept of the respective fitted lines.
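A minimal least-squares sketch in Python follows; the (x, y) pairs are invented for illustration, and the closed-form slope and intercept are the standard solution obtained by setting the two partial derivatives of the summed squared residuals to zero.

    # Least-squares fit of y = m*x + c to illustrative (x, y) pairs.
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]
    n = len(xs)

    x_mean = sum(xs) / n
    y_mean = sum(ys) / n

    # Setting the derivatives of sum((y - m*x - c)**2) with respect to m and c
    # to zero yields the normal equations, whose solution is written directly:
    sxx = sum((x - x_mean)**2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = y_mean - m * x_mean

    # Residuals: differences between observed and predicted values of y.
    residuals = [y - (m * x + c) for x, y in zip(xs, ys)]

    print(f"slope m = {m:.3f}, intercept c = {c:.3f}")
    print("residuals:", [round(r, 3) for r in residuals])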

Context

The earliest studies of probability, largely in connection with gambling, led to the development of combinatorics and permutations. Error analysis originally arose as part of the experimental physics of Pierre-Simon Laplace (1749-1827), Carl Friedrich Gauss (1777-1855), and Simeon-Denis Poisson (1781-1840), and only gradually became part of statistics through the efforts of Lambert-Adolphe-Jacques Quetelet (1796-1874) and notably Karl Pearson (1857-1936). These workers were groundbreakers in defining and applying the major varieties of probability distributions, several of which bear their names, in mechanics and thermodynamics, hydrology, geology, and other sciences. The importance of so-called correlated or interrelated variations was dramatically emphasized by Charles Darwin (1809-1882) and Francis Galton (1822-1911) in biology and genetics, and more generally by Charles Spearman (1863-1945) and George Yule (1871-1951). In this way, the theory of errors has been the main link between the theories of statistics and of probability. By the late 1890's, the basic theories for treating random errors by least squares and probability distributions had been developed and presented in textbooks in a form essentially unchanged to the present.

The nearly ubiquitous Gaussian normal error law, although computationally much more convenient and simpler to interpret than most other distributions, came increasingly to be recognized as being of limited application. As one common example, circa 1890 Pearson first showed that if a large set of data measurements are not all equally reliable, their residual errors cannot be expected to follow the histogram of a normal Gaussian curve. Higher-order measures, or moments, quantifying the type and extent of asymmetry, or skewness, were also developed by Pearson and Vilfredo Pareto (1848-1923), among others.

Another major development in the theory of errors occurred in the work of Norbert Wiener (1894-1964) and Claude Shannon (born 1916) on the more general treatment of random or stochastic noiselike sequences and processes. This work includes the modeling, estimation, and removal of noise from radar, sonar, seismic, and other time-series signals. A time series is any sequence of data in which time is one of the variables (for example, the number of earthquakes recorded over a number of years, or the amplitude of a seismic trace versus the seconds elapsed after an earthquake). A closely related subsequent effort is that of formal estimation theory, which further develops these methods to estimate statistically the true values of parameters governing the output response of linear and nonlinear systems in the presence of ambient and intrinsic noise.

Principal terms:

ACCURACY: the intrinsic objective truth of a measurement or calculation, or the degree to which calculations or measurements approach true values

MODEL: a conceptual representation, in mathematical form, by which one or more behaviors or properties of a given data set can be organized, simulated, and predicted

PRECISION: the accuracy with which a given measurement or calculation is made, regardless of whether the measured datum is a true and accurate representation

PROBABILITY DISTRIBUTION FUNCTION: a real-valued function which gives the probability that a particular random variable falls within a given range of values

RANDOM ERRORS: discrepancies or unwanted variations resulting from unknown causes, frequently large in number and very complex and irregular in behavior

REGRESSION: the procedure by which behavior of one variable is tested against that of another, usually by plotting them in an x-y coordinate system

ROUND-OFF ERROR: the error, introduced in data analysis and computations with data, resulting from the limited arithmetic precision (number of decimal places) to which numbers can be represented

SIGNIFICANT FIGURES: only those arithmetic digits, from measurement and computation, conveying actually meaningful (precise) information

SYSTEMATIC ERRORS: those errors resulting from a definite and, in principle, identifiable cause, usually cumulative yet often elusive, since repeated observation does not necessarily reveal or reduce systematic errors

TRUNCATION ERROR: the error introduced as a result of the limits in the number of digits available for any given finite arithmetic computation

Bibliography

Barford, N. C. EXPERIMENTAL MEASUREMENTS: PRECISION, ERROR, AND TRUTH. 2d ed. New York: Wiley, 1985. This book is one of the better general conceptual-operational introductions to the concepts and methods of error analysis in the context of experimental measurements in the natural sciences. Assuming no prerequisites other than high school and first-year-college mathematics, Barford provides additional historical and philosophical background.

Barry, B. A. ERRORS IN PRACTICAL MEASUREMENTS IN SCIENCE, ENGINEERING, AND TECHNOLOGY. New York: Wiley, 1978. Barry gives a somewhat more advanced, but still accessible (to college undergraduates) discussion of specific empirical sources of error, particular statistical formulas for their treatment, and more advanced topics such as testing the accuracy of statistical error estimates. Brings together much reference documentation.

Bevington, Philip R. DATA REDUCTION AND ERROR ANALYSIS FOR THE PHYSICAL SCIENCES. New York: McGraw-Hill, 1969. Bevington's is a widely used standard reference for physics, engineering, and chemistry students, with many worked examples. Discussing and deriving in full detail all topics of this article, Bevington gives several good accounts for using error analysis in the planning as well as the analysis of experimental results.

Cacko, Jozef, Matej Bily, and Juraj Bukoveczky. RANDOM PROCESSES: MEASUREMENT, ANALYSIS, AND SIMULATION. New York: Elsevier, 1987. Supplements the traditional emphasis on manual calculations with use of computer and personal computer algorithms. The amount of statistical data analysis software available is almost overwhelming; this book gives some overview of generic and specific programs. Presents material complementing, but in some specifics differing in emphasis and interpretation from, Davenport (cited below).

Davenport, Wilbur B. PROBABILITY AND RANDOM PROCESSES: AN INTRODUCTION FOR APPLIED SCIENTISTS AND ENGINEERS. New York: McGraw-Hill, 1970. A classic treatment, matching mathematical treatment with a practical approach stressing intuition from experience and worked examples. Includes discussions concerning the law of large numbers, chi-squared tests, and other advanced topics, primarily from the electrical and operations engineering perspective. In many places, Davenport offers detailed examples with comparative observations on the appropriate selection and usefulness of common statistical tests.

Novak, Erich. DETERMINISTIC AND STOCHASTIC ERROR BOUNDS IN NUMERICAL ANALYSIS. New York: Springer-Verlag, 1988. Novak defines the errors frequently occurring in a wide class of manual and computer-implemented mathematical calculations, starting from basic round-off and truncation errors to numerical stability analysis. Underscores in practical as well as theoretical fashion the distinctions between error processes or causes that are, and are not, treatable via models of known physical phenomena.

Patil, Ganapati P., ed. RANDOM COUNTS IN SCIENTIFIC WORK. 3 vols. University Park: Pennsylvania State University Press, 1970. Patil's trilogy offers one of the most comprehensive compendia of principles and examples of contemporary error analysis, covering almost the entire gamut of science, engineering, medical, business, and social science applications. Attempts to guide the reader to the most appropriate particular technique and sufficient background for a wide selection of specific applications.

Sorenson, J. PARAMETER ESTIMATION. New York: Marcel Dekker, 1980. Applications to engineering sciences are comprehensively stressed. Gives many complete derivations not readily found elsewhere and connects the subject of statistical error analysis with that of parameter estimation in signal processing and inversion. A good balance between an engineering-science and mathematical-statistical treatment and orientation.

Taylor, J. R. ERROR ANALYSIS: THE STUDY OF UNCERTAINTIES IN PHYSICAL MEASUREMENTS. Mill Valley, Calif.: University Science Books, 1982. Further develops and extends many of the traditional error analysis concepts presented in works cited above. Includes specific and anecdotal reports and gives many more examples of the kinds of systematic error, as well as an outline of the applicability of parametric versus nonparametric statistics, particularly as these arise in classical physics and material science.

Van der Ziel, Aldert. NOISE IN MEASUREMENTS. New York: Wiley, 1976. A good thematic introduction to the varieties, forms, and treatment of noise-type errors in a wide spectrum of physical measurements. Includes discussion of the variety and definitions of noise as unwanted, interfering, spurious, or otherwise erroneous signals. Colored, shot, Johnson, Brownian, and other specific noise models are considered.

Essay by Gerardo G. Tango