Exploring Mean and Median
"Exploring Mean and Median" delves into the fundamental statistical concepts of central tendency, specifically focusing on the mean, median, and mode. These measures provide insight into the heart of a dataset by indicating where the central value lies. The mean, often referred to as the average, is particularly useful for normally distributed data, while the median serves as a critical alternative when data is skewed or not evenly distributed. The mode, representing the most frequently occurring value, is the simplest measure to grasp.
Understanding when to use the mean, median, or mode is essential, as each can yield different insights depending on the data's distribution. The central limit theorem further enriches these concepts: the distribution of the sample mean tends toward a normal distribution as the sample size increases, even if the original data are not normally distributed. This principle is especially relevant in practical applications, such as medical research and data analysis. By selecting the appropriate measure of central tendency, one can more accurately interpret and draw conclusions from data across various fields.
The mean, median, and mode are fundamental tools in statistics. Together they make up the concept of "central tendency": each of the three measures indicates where the middle of a dataset lies. The mean and median, which fall close to the center of the data, can be regarded as representative of the central value.
The arithmetic mean, commonly called the average, is a very popular measure of central value. Care needs to be taken, however, when choosing which of the three (mean, median, or mode) to use. The choice is based on how the data under examination are distributed. If the distribution is fundamentally Gaussian, the mean is the tool of choice. If $x_1, x_2, \ldots, x_n$ are the $n$ observations under study, the sample mean is $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$. Otherwise, use the median or mode. The median is the middle value: after arranging the observations in ascending order, it is the $\frac{n+1}{2}$th value if $n$ is odd, and the average of the $\frac{n}{2}$th and $\left(\frac{n}{2}+1\right)$th values if $n$ is even. The mode is the most common value.
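The definitions above can be sketched in a few lines of Python. This is a minimal illustration that relies on the standard library's `statistics` module; the helper names `central_tendency` and `median_by_position` and the two small samples are hypothetical, chosen only to show the odd and even cases of the median.

```python
from statistics import mean, median, multimode


def central_tendency(values):
    """Return the mean, median, and list of modes of a non-empty sequence."""
    return {
        "mean": mean(values),
        "median": median(values),
        # multimode returns every value tied for the highest frequency
        "modes": multimode(values),
    }


def median_by_position(values):
    """Compute the median directly from the positional rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # odd n: the ((n + 1) / 2)th value (1-based), i.e. index (n + 1) // 2 - 1
        return ordered[(n + 1) // 2 - 1]
    # even n: average of the (n / 2)th and (n / 2 + 1)th values (1-based)
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


# Hypothetical samples, one with an odd and one with an even number of values.
odd_sample = [7, 3, 9, 3, 5]
even_sample = [7, 3, 9, 3, 5, 11]

print(central_tendency(odd_sample))
print(median_by_position(odd_sample), median_by_position(even_sample))
```

The positional helper should agree with `statistics.median` on any non-empty numeric sample; it is written out only to make the odd and even cases explicit.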
The mode is the simplest concept to understand: it is the value at which the highest number of observations occurs. The mean is the balance point of the data: the deviations of the values above it exactly offset the deviations of the values below it. With the median, half of the data values lie above it and half lie below it. This may or may not be true of the mean.
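A small numerical check may make the contrast between the mean and the median more concrete. The sketch below uses a hypothetical right-skewed sample (the values in `incomes` are invented for illustration) to show that deviations from the mean sum to zero, while only the median is guaranteed to split the values into equal halves.

```python
from statistics import mean, median

# Hypothetical right-skewed sample: one large value pulls the mean upward.
incomes = [20, 22, 25, 27, 30, 31, 200]

m = mean(incomes)
med = median(incomes)

# The mean is the balance point: deviations from it cancel out
# (up to floating-point rounding).
deviation_sum = sum(x - m for x in incomes)

# Count how many values fall on each side of the two measures.
above_mean = sum(1 for x in incomes if x > m)
below_mean = sum(1 for x in incomes if x < m)
above_median = sum(1 for x in incomes if x > med)
below_median = sum(1 for x in incomes if x < med)

print(f"mean={m:.2f}, median={med}")
print(f"sum of deviations from mean: {deviation_sum:.2e}")
print(f"values above/below mean:   {above_mean}/{below_mean}")
print(f"values above/below median: {above_median}/{below_median}")
```

In this sample a single extreme value leaves most observations below the mean, which is the kind of situation in which the median is usually the more representative choice.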
Another fundamental concept associated with the mean is the central limit theorem. In almost any situation arising in medical practice, the distribution of the sample mean tends to become normal when n (the number of observations in the sample) is large, even when the underlying distribution among individuals is clearly not normal. This is a very useful property, especially when drawing inferences from the data. In most situations, n is deemed large when it is greater than or equal to 30. Exceptions to the central limit theorem exist, but they are rare and can generally be ignored. Gaussian conditions also hold for small n if the underlying distribution is roughly Gaussian. So the only non-Gaussian condition for the mean arises when n is small and the underlying distribution is far from Gaussian; that case calls for inferential methods different from those used when Gaussian conditions are met.
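The behavior described above can be explored with a short simulation. The following is a sketch under stated assumptions, not a proof: it draws from an exponential distribution (an arbitrary choice of a clearly skewed population), uses n = 30 as in the rule of thumb, and measures asymmetry with a simple moment-based skewness. The function names, the seed, and the number of replications are all hypothetical choices.

```python
import random
from statistics import fmean, pstdev


def skewness(values):
    """Moment-based skewness: about 0 for a symmetric (e.g. normal) shape."""
    m = fmean(values)
    s = pstdev(values)
    return fmean([((x - m) / s) ** 3 for x in values])


def simulate_sample_means(n, replications, rng):
    """Draw `replications` samples of size n and return each sample's mean."""
    return [
        fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replications)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Individual observations from a skewed population (exponential, mean 1).
raw = [rng.expovariate(1.0) for _ in range(5000)]

# Means of samples of size 30 drawn from the same population.
means_n30 = simulate_sample_means(30, 5000, rng)

print(f"skewness of individual values: {skewness(raw):.2f}")
print(f"skewness of sample means (n=30): {skewness(means_n30):.2f}")
print(f"average of sample means: {fmean(means_n30):.3f}")
```

The individual values should remain strongly skewed, while the sample means should be much closer to symmetric and centered near the population mean of 1, in line with the theorem.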
Similar thinking applies when dealing with probabilities. An estimated probability, such as a sample proportion, has its own sampling distribution, and that distribution is likewise subject to the central limit theorem.
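The same idea can be sketched for a proportion. In the illustration below, the true probability `p = 0.3`, the sample size `n = 100`, the seed, and the helper name `sample_proportion` are all hypothetical. The simulated proportions are compared with the standard error $\sqrt{p(1-p)/n}$ that the normal approximation for a proportion uses.

```python
import math
import random
from statistics import fmean, pstdev


def sample_proportion(p, n, rng):
    """Proportion of successes in n independent trials with probability p."""
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return successes / n


rng = random.Random(1)  # fixed seed so the run is repeatable
p, n, replications = 0.3, 100, 4000

proportions = [sample_proportion(p, n, rng) for _ in range(replications)]

observed_center = fmean(proportions)
observed_spread = pstdev(proportions)
# Standard error of a proportion under the normal approximation.
expected_spread = math.sqrt(p * (1 - p) / n)

print(f"center of sample proportions: {observed_center:.3f} (true p = {p})")
print(f"spread of sample proportions: {observed_spread:.4f}")
print(f"sqrt(p(1-p)/n):               {expected_spread:.4f}")
```

If the approximation is reasonable for the chosen `p` and `n`, the observed center should sit near `p` and the observed spread near the theoretical standard error.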