Density

Density is a central concept in statistics, and most statistical analysis depends on it. For a continuous variable, the set of possible values is infinite, and the probability of observing any exact value is zero. Probabilities must therefore be assigned to ranges of values between two chosen limits, rather than to particular values.

Overview

Let p be the proportion of individuals in a given population whose values lie between two chosen limits. The value measured on an individual chosen at random is the random variable, and the probability that it lies between the two limits is equal to p. To give a value to this proportion, develop a frequency distribution, derive the relative frequency from it, and from that derive the relative frequency density. As the sample grows and the intervals shrink, the relative frequency density approximates the probability distribution of the variable, which is called the probability density function.
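In symbols, the relationship between the proportion p and the density can be sketched as follows. The notation is an assumption introduced here for illustration: X is the random variable, a and b are the two chosen limits, and f is the probability density function.

```latex
p = P(a \le X \le b) = \int_a^b f(x)\,dx,
\qquad f(x) \ge 0,
\qquad \int_{-\infty}^{\infty} f(x)\,dx = 1 .
```

Setting a = b gives an integral over an interval of zero width, which is why the probability of any exact value is zero.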

Figure 1 is a histogram based on a set of data recording the heights of a number of individuals in a population. The heights of the rectangles represent the frequency, that is, the number of individuals falling within each interval. These heights are determined by the shape of the underlying distribution, by the sample size, and by the size of the intervals used in the histogram. To work with this, it is necessary to develop the concept of relative, or proportional, frequency. This is achieved by considering the proportion of the sample falling within each interval, rather than the number of individuals falling within each interval, which does away with any dependency on the size of the sample. The heights of the rectangles then represent the proportion of observations falling within certain limits, and these heights depend only on the shape of the underlying distribution and the size of the intervals.
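The step from frequency to relative frequency can be illustrated with a minimal sketch. The height values, the interval width of 10, and the function names below are hypothetical and chosen only for illustration. Doubling the sample doubles every frequency but leaves the relative frequencies unchanged.

```python
def frequencies(values, start, width, n_bins):
    """Count the values falling in each interval
    [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts


def relative_frequencies(counts):
    """Convert counts into proportions of the whole sample."""
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical heights in centimetres (illustrative data only).
heights = [152, 158, 161, 163, 165, 167, 168, 171, 174, 183]

counts = frequencies(heights, start=150, width=10, n_bins=4)
rel = relative_frequencies(counts)

# The same sample repeated twice: a sample of double the size
# with the same shape.
doubled = heights * 2
counts2 = frequencies(doubled, start=150, width=10, n_bins=4)
rel2 = relative_frequencies(counts2)

print(counts, counts2)  # frequencies change with the sample size
print(rel, rel2)        # relative frequencies do not
```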

[Figure 1: histogram of the heights of individuals in a population]

It is possible to adjust for the size of the intervals as well. This involves a move to relative frequency density, the proportion of observations in the interval per unit of X. When the interval size is 10, the relative frequency density is the relative frequency divided by 10, so that the area of each rectangle, rather than its height, represents the proportion of observations in the interval. With larger samples, smaller intervals can be taken, and the histogram comes to approximate a smooth curve closely. That curve is the probability density function. With large samples, statistics calculated from the data, such as the sample mean, tend to follow the normal distribution approximately, largely regardless of the distribution of the observations themselves. Much statistical analysis rests on this.
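A minimal sketch of relative frequency density follows, again using hypothetical height data and a hypothetical function name. It divides each relative frequency by the interval width and checks that the rectangle areas sum to 1 whichever width is chosen.

```python
def relative_frequency_density(values, start, width, n_bins):
    """Relative frequency per unit of X for each interval
    [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_bins:
            counts[i] += 1
    total = sum(counts)
    # Relative frequency divided by the interval width.
    return [c / total / width for c in counts]


# Hypothetical heights in centimetres (illustrative data only).
heights = [152, 158, 161, 163, 165, 167, 168, 171, 174, 183]

wide = relative_frequency_density(heights, start=150, width=10, n_bins=4)
narrow = relative_frequency_density(heights, start=150, width=5, n_bins=8)

# Area of each rectangle = density * width; the areas sum to 1.
area_wide = sum(d * 10 for d in wide)
area_narrow = sum(d * 5 for d in narrow)

print(wide)
print(narrow)
print(area_wide, area_narrow)
```

Because the total area is fixed at 1, histograms drawn with different interval sizes are on the same vertical scale and can be compared directly with a density curve.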

Bibliography

Anderson, David R., and Dennis J. Sweeney. Essentials of Statistics for Business and Economics. Stamford, CT: Cengage, 2015.

Durrett, Rick. Elementary Probability for Applications. Cambridge: Cambridge UP, 2009.

Petersen, Alexander, Chao Zhang, and Piotr Kokoszka. "Modeling Probability Density Functions as Data Objects." Econometrics and Statistics, vol. 21, Jan. 2022, pp. 159-178, doi.org/10.1016/j.ecosta.2021.04.004. Accessed 18 Nov. 2024.

Ross, Sheldon. A First Course in Probability. Harlow, UK: Pearson, 2012.