Theoretical Statistics

Statistics allow one to organize and interpret data that would otherwise be incomprehensible. However, statistics is much more than a set of mathematical techniques that are used to manipulate data in order to derive an answer. For statistics to be truly useful, one must recognize and understand the fact that there is an underlying uncertainty and variability in data and collections of data. Analyzing and interpreting data using statistics is a messy process and sampling error, measurement error, and estimation error can negatively impact the results. In addition, not every statistical technique is appropriate for use in every situation. The researcher needs to be careful to pick the correct technique to match the characteristics of the data being analyzed. Statistics do not yield exact results, but only probabilities. Although not an exact science -- or at least not a science of exact results -- if one understands the theoretical underpinnings, statistics can be of immeasurable help in understanding the phenomena of the real world.

Taking a statistics course can be an intimidating experience for many people. Perhaps the reason is that they fear the precision required to accurately calculate answers, or perhaps it is because a long list of arithmetic procedures looks like an overwhelming amount of effort to obtain a single numerical answer. Perhaps, however, the real problem lies with the front and back ends of the statistical process: Determining which analytical technique to use and knowing how to properly interpret the end result of the calculations. The options can be confusing. How does one properly design an experiment to adequately test a hypothesis so that all the important variables are considered? What statistical analysis technique is appropriate for evaluating the data? Is the test one-tailed or two-tailed? What is the confidence level for the results? These and other questions can plague beginning students and professionals alike as they try to make sense of data and draw real-world conclusions. The trick to understanding statistics is to understand the theory and principles underlying them: Statistics is much more than a set of mathematical techniques used to manipulate data in order to derive an answer, and to use it well one must recognize the underlying uncertainty and variability in data and collections of data.

As human beings, most of us try to move from a position of uncertainty to one of certainty. Knowing "truth" is comforting, and can help us make decisions and plan for the future. However, life does not work that way, and statistics does not work that way, either. Rather, statistics suggest with various degrees of confidence (or lack thereof) that one interpretation of the results is more likely than another. Statistics do not yield black-and-white answers: They give best guesses or scientific estimates.

Statistical Error

Sampling Error

In reality, analyzing and interpreting data using statistics is a messy process. Error is a fact of life when dealing with real world problems. People and things do not always act the way that we expect them to. This happens for a number of reasons. First, for most situations it is virtually impossible to gather data on every member of a population -- the entire group of subjects belonging to a certain category. For example, if one wanted to know what features people in the United States would like in a new widget, it would be virtually impossible to ask each individual: There are simply too many people for this to be a reasonable and cost-effective task. In addition, some people may be out of the country or otherwise unavailable for comment. Therefore, data are usually collected on a sample -- a subset of the population that is assumed to be representative of the population. Sometimes a random sample is used -- one chosen at random from the larger population on the assumption that such samples tend to reflect the characteristics of the larger population. The problem with this assumption, however, is that it is impossible to tell whether or not the sample is truly representative without looking at the characteristics of the population itself. As a result, sampling error -- an error that occurs in statistical analysis when the sample does not represent the population -- can occur and throw off the results of the research. Further, some respondents may give inaccurate answers for any of a number of reasons, ranging from not understanding the question or not paying attention to deliberately trying to throw off the results, further compounding the possibility of error.
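
A brief simulation can make sampling error concrete. The sketch below is purely illustrative -- the "population" of widget-preference scores is hypothetical, not data from this essay: each random sample drawn from the same population produces a slightly different mean, and none of them exactly matches the true population mean.

import random
import statistics

random.seed(42)

# Hypothetical population: 100,000 simulated widget-preference scores on a 0-100 scale.
population = [random.gauss(70, 12) for _ in range(100_000)]
population_mean = statistics.mean(population)

# Each random sample yields a slightly different mean -- that difference is sampling error.
for i in range(5):
    sample = random.sample(population, 500)
    sample_mean = statistics.mean(sample)
    print(f"Sample {i + 1}: mean = {sample_mean:.2f} "
          f"(sampling error = {sample_mean - population_mean:+.2f})")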

Measurement Error

Another type of error that can occur when using statistics to analyze data is measurement error. This is the portion of an observed score that is random noise rather than part of the true value being measured. For example, a researcher may read the level in a beaker slightly high at times and slightly low at other times. In most cases, such measurement errors tend to cancel each other out with repeated measurements (i.e., an accidentally inflated value on one measurement is compensated for by an accidentally deflated value on another measurement). However, systematic measurement errors can also be introduced into the situation, and these do not cancel each other out. For example, the way a question is worded on a questionnaire or in an interview may be misunderstood, so that an accurate response is not obtained. Similarly, researchers sometimes unconsciously bias the results through their expectations or the way that they collect the data. For example, if Harvey likes to flirt with women, he may end up with much better data collected from the women he interviews than from the cursory discussions he has with the men in his sample.
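
To make the distinction concrete, the brief sketch below (a hypothetical beaker-reading example, not taken from the essay) simulates both kinds of error: zero-mean random noise largely averages out over repeated readings, while a constant upward bias does not.

import random
import statistics

random.seed(0)

true_volume = 50.0  # hypothetical true level in the beaker, in milliliters

# Random error: readings scatter above and below the true value and largely cancel out.
random_error_readings = [true_volume + random.gauss(0, 0.5) for _ in range(1_000)]

# Systematic error: every reading is 0.3 mL too high (e.g., a miscalibrated instrument),
# so the bias remains no matter how many readings are averaged.
systematic_error_readings = [true_volume + 0.3 + random.gauss(0, 0.5) for _ in range(1_000)]

print(f"Mean with random error only: {statistics.mean(random_error_readings):.3f}")
print(f"Mean with systematic error:  {statistics.mean(systematic_error_readings):.3f}")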

Estimation Error

Estimation error is the error introduced by statistical estimates. This type of error can come from several sources. Although it is assumed that there is a real or "true" value underlying the numbers that are used in statistical calculations, one almost never gets to work with these true values. For example, most people remember the value of π as 3.14, 3.1416, or some other finite approximation. However, π is actually an irrational number whose decimal expansion never ends, so it cannot be computed exactly either by finite human beings or their finite computers. In simple situations, how one rounds π or any long decimal number is not of particular importance to the outcome of the calculations in which it is used. In other instances, however, it can be, and the rounding error, magnified by subsequent computations, can throw off the entire calculation. Each number is merely an approximation and not a true value. If the numbers are rounded properly, the errors may not aggregate. However, if the numbers used in calculations are systematically rounded up, for example, the resultant error will compound.
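
A small arithmetic sketch, using hypothetical figures and only Python's standard library, illustrates the point: when each value is rounded conventionally, the individual rounding errors largely offset one another, but when every value is systematically rounded up, the error accumulates across the whole calculation.

import math
import random

random.seed(1)

# Hypothetical amounts carried to four decimal places.
values = [round(random.uniform(0, 100), 4) for _ in range(10_000)]
exact_total = sum(values)

# Round each amount to two decimal places in two different ways before summing.
properly_rounded_total = sum(round(v, 2) for v in values)         # conventional rounding
rounded_up_total = sum(math.ceil(v * 100) / 100 for v in values)  # always round up

print(f"Error from conventional rounding: {properly_rounded_total - exact_total:+.2f}")
print(f"Error from always rounding up:    {rounded_up_total - exact_total:+.2f}")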

Choosing the Proper Statistical Technique

In addition, not every statistical technique is appropriate for use in every situation. Sometimes it is obvious which statistical analysis method to use: If one wants to better understand how two variables are related, one would calculate a correlation coefficient rather than an analysis of variance. In other situations, however, the choice of the most appropriate statistical technique is not so obvious. Some techniques, for example, assume that the samples being analyzed are independent, whereas other techniques do not make this assumption. Sometimes the student or researcher may pick a technique not because it is the most appropriate technique for analyzing the data, but because it is the technique with which s/he is most familiar or comfortable. For example, although some complex situations with multiple variables are most appropriately analyzed using multivariate techniques such as multivariate analysis of variance (MANOVA), the complexity of the technique as well as of the interpretation of the results makes it tempting for some to use multiple simple techniques (e.g., t-tests) to test pairs of variables individually, with the hope that something will be significant. In many cases, something will in fact appear significant. However, this "significance" will not reflect underlying differences but rather be an artifact of testing the data too many times. The more tests are run on a single set of data, the more probable it is that spuriously significant results will occur merely by chance. This approach is often referred to as "shotgunning." Conclusions drawn from such analyses are suspect at best. There are a wide variety of statistical techniques available for data analysis. The researcher needs to be careful to pick the correct one to match the characteristics of the data being analyzed.
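
The inflation of chance findings produced by shotgunning can be quantified with a simple calculation. Assuming the tests are independent and each is run at the .05 significance level, the probability of at least one false positive among k tests is 1 - (0.95)^k; the short sketch below works out the numbers.

# With no real differences present, the chance of at least one spuriously
# "significant" result grows quickly with the number of independent tests run.
alpha = 0.05
for k in (1, 5, 10, 20):
    familywise_error = 1 - (1 - alpha) ** k
    print(f"{k:>2} tests: probability of at least one false positive = {familywise_error:.2f}")
# Output: roughly 0.05, 0.23, 0.40, and 0.64, respectively.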

For example, there are a number of statistical techniques from which to choose if one wants to determine the difference between the means of different groups on some characteristic. Differences between two measures of one group, or one measure for two independent groups, often call for the use of t-tests. However, as discussed above, as the number of measures or groups increases, one should choose a statistical technique designed for that situation rather than sequentially running t-tests until all the variables are tested. For these situations, analysis of variance is typically used. This family of techniques analyzes the joint and separate effects of multiple independent variables on a single dependent variable and determines the statistical significance of the effect. The questions asked are similar to the questions answered by t-tests: Are there differences in the means between and among the various groups? However, analysis of variance avoids the problem of shotgunning by simultaneously analyzing all the groups. In many ways, analysis of variance is an extension of the t-test for use in complex situations. If, however, there are multiple measures taken on multiple groups, a multivariate analysis of variance is more appropriate. This technique tends to be complex both to compute and to interpret, however, so the tendency of many people is to avoid it in favor of running multiple analyses of variance. However, as with the case of running multiple t-tests instead of an analysis of variance, this is a form of shotgunning and can compound the error inherent in the data and lead to spurious results.
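
As a rough illustration of the difference in approach, the sketch below uses hypothetical scores for three independent groups and assumes the SciPy library is available; a single one-way analysis of variance replaces a series of pairwise t-tests.

from scipy import stats

# Hypothetical scores for three independent groups.
group_a = [82, 75, 88, 90, 79, 84]
group_b = [70, 68, 75, 72, 66, 71]
group_c = [85, 89, 92, 87, 90, 86]

# A one-way ANOVA tests all three group means in a single omnibus test
# rather than running three separate pairwise t-tests.
f_statistic, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_statistic:.2f}, p = {p_value:.4f}")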

These are just a few examples of the underlying principles and theory of statistics. Statistics allow one to organize and interpret data that otherwise would be incomprehensible. However, this order is purchased at the cost of certainty. Statistics do not yield exact results, but only probabilities. Although not an exact science -- or at least not a science of exact results -- if one understands the theoretical underpinnings, statistics can be of immeasurable help in understanding the phenomena of the real world.

Applications

Statistical Limitations

Statistics are invaluable in helping one better interpret data, trends, and other empirical evidence. However, they are not without limitations. Statistics do not yield black and white answers, but merely describe trends and probabilities. To interpret statistics in a meaningful way, one must understand these limitations.

Measures of Central Tendency

It is tempting at times for people who do not understand the underlying theory of statistics to look at descriptive statistics and "eyeball" them to generate an answer to a question. This method, however, is of little practical use. For example, measures of central tendency can easily be misinterpreted. In addition, they are each more useful in particular situations. The use of the wrong statistic can seriously misguide those relying on it. The various measures of central tendency are based on different calculation methods; they are not equivalent to each other, nor can they be interchanged in most circumstances. The mean is pulled in the direction of the skew (i.e., toward the tail of the distribution, where the extreme values lie). This fact means that the mean disproportionately reflects any data points that are at the extreme end of the distribution. If the two ends of the distribution are balanced (i.e., the distribution is not skewed), the mean is unaffected. If they are not balanced, however, the mean is pulled in the direction of the outlying data. The median, by contrast, is far less affected by extreme scores, and the mode is not affected by them at all. For example, if the "average" salary reported were the mode, and most of the people in that occupation made only $20,000 per year, the figure would be quite different from the mean, which is pulled in the direction of the skew by the few very high salaries. Figure 1 shows an example of how the various measures of central tendency are affected by the shape of the underlying distribution. The median salary in this example is much closer to the mode than it is to the mean because of the small proportion of people who make much more than the rest.

Figure 1. The measures of central tendency (mode, median, and mean) in a skewed salary distribution.
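
A brief sketch with hypothetical salary figures, using Python's statistics module, shows the pattern illustrated in Figure 1: in a right-skewed distribution the mean is pulled toward the handful of very high values, while the median and mode stay close to what most people actually earn.

import statistics

# Hypothetical salaries: most workers earn about $20,000, while a few earn far more.
salaries = ([20_000] * 60 + [25_000] * 20 + [35_000] * 10 +
            [90_000] * 7 + [250_000] * 3)

print(f"Mode:   ${statistics.mode(salaries):,}")       # $20,000 -- the most common salary
print(f"Median: ${statistics.median(salaries):,.0f}")  # $20,000 -- resistant to the skew
print(f"Mean:   ${statistics.mean(salaries):,.0f}")    # about $34,300 -- pulled up by the few high earners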

Misinterpretation of Data

Misinterpretation of data based on a misunderstanding of the nature of statistics occurs frequently. For example, every time one opens the newspaper, graphs, statistics, and interpretations leap off the page. An advertisement for a book may state that the authors have a combined experience of 50 years in the field. However, if there are five authors with 10 years of experience each, does that mean that the book is more worthwhile than a book written by one person with 40 years' experience? The book written by the individual with 40 years of continuous experience may have better insights than a book written by several people, each of whom has far less individual experience. The statistic cited does not reflect this difference.

Inferential Statistics

Inferential statistics are also liable to misinterpretation. For example, the coefficient of correlation is frequently misinterpreted. This statistic is used to articulate the degree to which high values of one variable are associated with high (or, in the case of negative correlation, low) values of the other. For example, for most cases one could truthfully say that weight gain in the first year of life is positively correlated with age (i.e., the older the baby is, the more it is likely to weigh). However, this strong, positive correlation does not apply to adults (i.e., heavier adults are not necessarily older than lighter adults). Correlation only shows the relationship between the two variables, however: It does not explain why the relationship occurs or what caused it. A classic example of how this statistic can be misused examines the correlation between the number of storks seen nesting in chimneys and the number of births in the villages of a northern European country. The strong, positive correlation could lead someone to conclude from this evidence that storks bring babies. The truth, however, is that babies conceived during the warm summer months were born in the spring, the same season in which the storks returned to nest over the warm chimneys; the timing of the two events merely coincided. The correlation, therefore, was spurious. Correlation does not indicate causation. One must be similarly cautious in the interpretation of all inferential statistics.
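
A correlation coefficient itself is easy to compute; the difficulty lies in interpreting it. The sketch below uses hypothetical infant age and weight figures and assumes SciPy is available; the strong positive coefficient it reports describes only the association between the two variables, not what causes it.

from scipy import stats

# Hypothetical data: infant age in months and weight in pounds.
age_months = [1, 2, 3, 4, 6, 8, 10, 12]
weight_lbs = [9.5, 11.0, 12.5, 13.5, 15.5, 17.0, 18.5, 20.0]

r, p_value = stats.pearsonr(age_months, weight_lbs)
print(f"r = {r:.2f}, p = {p_value:.4f}")
# The strong positive r describes the association only; by itself it says
# nothing about what causes the weight gain.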

Terms & Concepts

Analysis of Variance (ANOVA): A family of statistical techniques that analyze the joint and separate effects of multiple independent variables on a single dependent variable and determine the statistical significance of the effect.

Correlation: The degree to which two events or variables are consistently related. Correlation may be positive (i.e., as the value of one variable increases the value of the other variable increases), negative (i.e., as the value of one variable increases the value of the other variable decreases), or zero (i.e., the values of the two variables are unrelated). Correlation does not imply causation.

Data: (sing. datum) In statistics, data are quantifiable observations or measurements that are used as the basis of scientific research.

Descriptive Statistics: A subset of mathematical statistics that describes and summarizes data. Descriptive statistics include graphing techniques, measures of central tendency (i.e., mean, median, and mode), and measures of variability (e.g., range, standard deviation).

Hypothesis: An empirically testable declaration that certain variables and their corresponding measures are related in a specific way proposed by a theory.

Inferential Statistics: A subset of mathematical statistics used in the analysis and interpretation of data. Inferential statistics are used to make inferences such as drawing conclusions about a population from a sample and in decision making.

Population: The entire group of subjects belonging to a certain category (e.g., all women between the ages of 18 and 27; all dry cleaning businesses; all college students).

Probability: A branch of mathematics that deals with estimating the likelihood of an event occurring. Probability is expressed as a value between 0 and 1.0, which is the ratio of the number of actual occurrences to the number of possible occurrences of the event. A probability of 0 signifies that there is no chance that the event will occur and 1.0 signifies that the event is certain to occur.

Sample: A subset of a population. A random sample is a sample that is chosen at random from the larger population with the assumption that such samples tend to reflect the characteristics of the larger population.

Sampling Error: An error that occurs in statistical analysis when the sample does not represent the population.

Statistics: A branch of mathematics that deals with the analysis and interpretation of data. Mathematical statistics provides the theoretical underpinnings for various applied statistical disciplines, including business statistics, in which data are analyzed to find answers to quantifiable questions. Applied statistics uses these techniques to solve real world problems.

Variable: An object in a research study that can have more than one value. Independent variables are stimuli that are manipulated in order to determine their effect on the dependent variables (response). Extraneous variables are variables that affect the response but that are not related to the question under investigation in the study.

Suggested Reading

Asmussen, S., Kroese, D. P. & Rubinstein, R. Y. (2005). Heavy tails, importance sampling and cross-entropy. Stochastic Models, 21(1), 57-76. Retrieved August 17, 2007, from EBSCO Online Database Academic Search Complete. http://search.ebscohost.com/login.aspx?direct=true&db=a9h&AN=16432808&site=ehost-live

Asmussen, S. & Pihlsgard, M. (2005). Performance analysis with truncated heavy-tailed distributions. Methodology and Computing in Applied Probability, 7(4), 439-457. Retrieved August 17, 2007, from EBSCO Online Database Business Source Complete. http://search.ebscohost.com/login.aspx?direct=true&db=bth&AN=18900103&site=ehost-live

Ghosh, J. K., Purkayastha, S., & Samanta, T. (2004). Sequential probability ratio tests based on improper priors. Sequential Analysis, 23(4), 585-602. Retrieved August 17, 2007, from EBSCO Online Database Academic Search Complete. http://search.ebscohost.com/login.aspx?direct=true&db=a9h&AN=15123758&site=ehost-live

Essay by Ruth A. Wienclaw, Ph.D.

Dr. Ruth A. Wienclaw holds a Doctorate in industrial/organizational psychology with a specialization in organization development from the University of Memphis. She is the owner of a small business that works with organizations in both the public and private sectors, consulting on matters of strategic planning, training, and human/systems integration.