Statistical Reasoning
Statistical reasoning is the process of using statistical methods and principles to analyze data, draw conclusions, and make informed decisions. It involves understanding both descriptive and inferential statistics, which help in summarizing data and making predictions about larger populations based on sample observations. A critical aspect of statistical reasoning is recognizing the potential for misinterpretation, as statistics can easily be manipulated or misunderstood without a solid grasp of underlying principles, such as correlation versus causation.
In practice, statistical reasoning is essential in various fields, including business, where it can aid managers in optimizing decision-making and improving profitability. For example, grasping the nuances of measures of central tendency—mean, median, and mode—can significantly influence interpretations of salary data or other metrics. Additionally, recognizing biases in experimental design, such as selection or participation bias, is crucial for obtaining valid results. Ultimately, statistical reasoning empowers individuals to employ critical thinking when evaluating data, ensuring that conclusions drawn are based on sound analysis rather than erroneous assumptions.
Overview
The gambler's fallacy is a logical fallacy in which someone incorrectly believes that one random event can be predicted from another. For example, a gambler operating under this fallacy might incorrectly assume that because the roulette wheel has landed on red for the last six spins, the "law of averages" dictates that it will land on black the next time. However, this supposed law, which assumes that events even out over time, is merely wishful thinking and does not represent the way that probability really works. In fact, the wheel has the same chance of landing on red as it does on black each and every time it is spun.
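This independence can be checked with a short simulation. The Python sketch below assumes a simplified wheel that lands on red or black with equal probability (ignoring the green zero); the relative frequency of red after a run of six reds comes out the same as the relative frequency of red overall:

    import random

    random.seed(7)
    trials = 1_000_000
    spins = [random.choice("RB") for _ in range(trials)]  # R = red, B = black

    # Overall relative frequency of red
    p_red = spins.count("R") / trials

    # Relative frequency of red immediately after six reds in a row
    after_runs = [spins[i] for i in range(6, trials)
                  if spins[i - 6:i] == ["R"] * 6]
    p_red_after_six = after_runs.count("R") / len(after_runs)

    print(f"P(red) overall:        {p_red:.3f}")
    print(f"P(red) after six reds: {p_red_after_six:.3f}")
    # Both values hover near 0.5: earlier spins do not influence the next one.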
Logical fallacies and erroneous thinking about the meaning of probability and stochastic processes are not confined to the gaming tables. Without understanding the principles behind statistical methods, it is difficult to analyze data or to correctly interpret the results. Take another classic example: It has been noted that the number of births in the villages of a certain northern European country was highly correlated with the number of storks seen nesting in the chimneys. Tongue-in-cheek, the researchers used this to conclude that storks bring babies. However, they went on to explain that this conclusion would be erroneous because the correlation coefficient r only suggests whether two variables are related, not whether one variable depends on the other. The likelier truth was that babies conceived during the warm summer months arrived the following spring, coinciding with the storks returning to nest over the warm chimneys. The correlation, therefore, was incidental and not causal.
Although statistics can help one better interpret data, trends, and other empirical evidence, they are only tools and must be interpreted by human beings. Statistics merely describe trends and probabilities and must be interpreted in context. This is not necessarily a straightforward process. For example, a student once announced that he had decided to major in a specialty because the average salary was higher than in other professions he had considered. Although his conclusion was valid based on his understanding of the data and interpretation of statistics, he had failed to take a number of other variables into account in his calculations. For example, he did not know what the "average" was based on (that is, whether it was the mean, median, or mode) or what the standard deviation of the distribution was. If the salary distribution for the profession were skewed (i.e., not symmetrical around the mean, with most data points clustered on one side and a long tail of outliers on the other) by a few people who make an extraordinarily high amount of money, for example, then the realistic average salary could be much lower. In addition, there was no guarantee that he would earn the "average" salary straight after graduation, or ever. He would have to finish college and earn two graduate degrees before he was ready to be considered for a professional salary. Even if he were able to vault these hurdles, his actual salary would still depend on his experience, grades, and other qualifications.
Another example of how measures of central tendency can potentially be misinterpreted involves the characteristics of the three different types: mean, median, and mode. Each of these measures has different characteristics and is more useful in certain situations than in others, depending on the characteristics of the underlying data. In a skewed distribution, the median is pulled somewhat in the direction of the skew (i.e., toward the end of the distribution with the outliers), although far less than the mean is. If the extreme ends of the distribution are balanced (i.e., not skewed), the median is not affected. However, in situations where these ends are not balanced and data are clustered toward one end of the distribution, the median shifts toward the long tail, though it remains far more resistant to the outlying data points themselves than the mean.
The mean is even more affected by extreme values. Using the example of the person who is considering a career based on average salary, if the "average" salary reported were the mode, and most of the people in that occupation made only $20,000 per year, the picture would be quite different from one based on the mean, which is pulled in the direction of the skew. The difference between the various measures of central tendency is real, and the measures are not interchangeable. The "average salary" in this example is much closer to the mode than it is to the mean because of the small proportion of people who make much more than the rest.
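The gap between the three measures can be made concrete with a minimal Python sketch; the salary figures below are invented for illustration:

    import statistics

    # Hypothetical salaries (in dollars) for a small occupation: most people
    # earn about $20,000, while a few earn far more, skewing the distribution.
    salaries = [20_000, 20_000, 20_000, 20_000, 22_000,
                24_000, 25_000, 30_000, 150_000, 400_000]

    print("mode:  ", statistics.mode(salaries))    # 20000 -> what most people make
    print("median:", statistics.median(salaries))  # 23000.0 -> middle of the pack
    print("mean:  ", statistics.mean(salaries))    # 73100 -> pulled toward the skew

Reporting any one of these three numbers as "the average salary" would paint a very different picture of the occupation.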
Opportunities to misinterpret statistics abound. Every time one opens the newspaper, for example, graphs, statistics, and interpretations leap off the page. An advertisement for a book may state that the authors have a combined experience of 50 years in the field. However, if there are five authors who each have 10 years of experience, it does not necessarily follow that the book is more worthwhile than a book written by one person with 40 years' experience. The latter book will more than likely have more insights than a book written by several people, each of whom has only a little experience. However, the quoted statistic does not reflect this difference.
It is not only descriptive statistics that can be misinterpreted; inferential statistics, too, are open to interpretation errors. As mentioned above, one statistic that is frequently misinterpreted is the coefficient of correlation. This inferential statistic is used to determine the degree to which values of one variable are associated with values of another variable. For example, in general, it would be fair to say that weight in the first year of life is positively correlated with age; in other words, the older the baby is, the more it is likely to weigh. However, this same correlation would not apply to most adults, as heavier adults are not necessarily older than lighter adults. Correlation only shows the relationship between the two variables; it does not explain why the relationship occurs or what caused it.
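As an illustration of how such a coefficient is computed, the following Python sketch derives Pearson's r for invented infant age and weight data using NumPy:

    import numpy as np

    # Hypothetical measurements for ten infants
    age_months = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    weight_kg  = np.array([4.2, 5.1, 5.8, 6.4, 6.9, 7.4, 7.8, 8.2, 8.5, 8.8])

    # Pearson's r measures the strength and direction of the linear
    # association on a scale from -1 to +1.
    r = np.corrcoef(age_months, weight_kg)[0, 1]
    print(f"r = {r:.3f}")  # close to +1: older infants tend to weigh more

    # A high r, however, says nothing about why the two variables move together.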
Inferential statistics are used for hypothesis testing to make inferences about the qualities or characteristics of a population based on observations of a sample. Statistical tests assess how probable the observed results would be if the null hypothesis (H0) were true. The null hypothesis is the statement that there is no statistical difference between the status quo and the experimental condition. If the null hypothesis is true, then the treatment or characteristic being studied made no difference to the end result. For example, a null hypothesis might state that whether a person is a child or an adult has no bearing on whether he or she prefers Super Crunchies cereal or Nutty Flakies cereal. The alternative hypothesis (H1), on the other hand, would state that there is in fact a relationship between the two variables.
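A hypothesis of this kind is commonly tested with a chi-square test of independence on a table of preference counts. The Python sketch below uses SciPy; the counts are invented for illustration:

    from scipy.stats import chi2_contingency

    # Hypothetical counts: rows are children and adults; columns are the
    # numbers preferring Super Crunchies and Nutty Flakies, respectively.
    observed = [[45, 55],
                [52, 48]]

    chi2, p, dof, expected = chi2_contingency(observed)
    print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
    # A large p-value gives no grounds to reject H0; a small one (below the
    # chosen significance level) favors H1, that preference depends on age group.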
In addition, a lack of understanding of the way that probability works can result in poor experimental design that yields spurious results. The results of a statistical data analysis do not prove whether the hypothesis is true; they simply indicate whether the evidence supports it at a given confidence level. For example, if a t-test or an analysis of variance results in a value that is significant at the α = .05 level, this means not that the hypothesis is true but that the analyst is willing to run the risk of being wrong five times out of 100.
In other words, there is a possibility of error when interpreting statistics and either accepting or rejecting the null hypothesis. A Type I error occurs when one incorrectly rejects a true null hypothesis and accepts the alternative hypothesis. In a Type I error, the analyst might conclude that adults enjoy Super Crunchies while children do not when, in fact, there is no difference. A Type II error occurs when one incorrectly accepts (i.e., fails to reject) a false null hypothesis. For example, if the analyst interpreted the statistics to mean that children and adults both equally enjoy Super Crunchies when in fact adults prefer it more than children do, then a Type II error would have occurred.
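The Type II error rate for a given design can be estimated by simulation. In the Python sketch below, a real difference between children and adults is built into the data (so the null hypothesis is false by construction), and the proportion of experiments in which a t-test nevertheless fails to reject it is counted; the group means, spread, and sample sizes are arbitrary choices for illustration:

    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(1)
    runs, misses = 2_000, 0
    for _ in range(runs):
        # Adults genuinely rate the cereal higher on average, so H0 is false.
        children = rng.normal(loc=5.0, scale=2.0, size=20)
        adults = rng.normal(loc=6.0, scale=2.0, size=20)
        if ttest_ind(children, adults).pvalue >= 0.05:
            misses += 1  # failing to reject a false H0 is a Type II error
    print(f"Estimated Type II error rate: {misses / runs:.2f}")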
In addition, not every statistical technique is appropriate for use in every situation. Some techniques, for example, assume that the samples being analyzed are independent of one another, whereas others do not make this assumption. There are a wide variety of statistical techniques available for data analysis, and the researcher needs to be careful to pick the correct one to match the characteristics of the data being analyzed. For example, although a complex situation with multiple variables can be analyzed using multivariate techniques such as multivariate analysis of variance (MANOVA), the complexity of both the technique and the interpretation of the results makes it tempting to use multiple simple techniques, such as t-tests, to test pairs of variables individually with the hope that something will be significant. However, the more tests that are run on a single set of data, the more probable it is that spuriously significant results will occur merely by chance. This approach is often referred to as "shotgunning," and conclusions drawn from such analyses are suspect at best.
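The arithmetic behind this warning is simple: if each test has a 5 percent chance of a spuriously significant result, the chance that at least one of twenty independent tests comes up significant is 1 - (0.95^20), or about 64 percent. The Python sketch below checks this by simulation, drawing every group from the same population so that any significant result is spurious by construction:

    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(0)
    alpha, n_tests, n_runs = 0.05, 20, 1_000
    hits = 0
    for _ in range(n_runs):
        # Twenty comparisons of groups drawn from the SAME population,
        # so the null hypothesis is true for every single test.
        pvals = [ttest_ind(rng.normal(size=30), rng.normal(size=30)).pvalue
                 for _ in range(n_tests)]
        if min(pvals) < alpha:
            hits += 1
    print(f"Experiments with at least one spurious result: {hits / n_runs:.2f}")
    print(f"Analytic prediction: {1 - (1 - alpha) ** n_tests:.2f}")  # about 0.64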
Applications
Understanding the principles underlying statistical methods can enable one to better apply critical thinking and statistical reasoning skills to the analysis and interpretation of data. This understanding starts with the development of a reasoned experimental design or mathematical model that is based on the literature and empirical observation rather than on a shotgun approach where any conceivable variable is thrown into the mix with the hope that the truth will somehow emerge.
Once the null and alternate hypotheses are formulated, an experimental design is developed that allows the researcher to empirically test the hypothesis. This process also involves the determination of what statistical method is most appropriate for the data to be analyzed. Mathematical statistics offers a wide range of techniques for analyzing data depending on what one is trying to do (e.g., compare the equivalence of the means of two populations, predict future events from the knowledge of current or past events, determine the degree of association of two or more variables). To choose the correct technique in order to reduce the possibility of spurious results, one must understand the underlying nature of the data as well as what one is trying to test.
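As a rough illustration of this matching process, each of the goals named in the parenthetical above corresponds to a standard technique; the pairings in the Python sketch below are common defaults rather than the only valid choices, and the data are invented:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    x = rng.normal(size=40)
    y = 2.0 * x + rng.normal(size=40)

    # Compare the means of two populations -> two-sample t-test
    t_stat, p_val = stats.ttest_ind(rng.normal(size=40), rng.normal(size=40))

    # Predict one variable from another -> simple linear regression
    fit = stats.linregress(x, y)  # slope, intercept, r-value, and so on

    # Measure the degree of association -> correlation coefficient
    r, p_r = stats.pearsonr(x, y)

    print(f"t-test p = {p_val:.3f}; slope = {fit.slope:.2f}; r = {r:.2f}")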
Before actually collecting the data, one must first define the sample from whom the data are to be collected. The sample needs to be representative of the population about which the analyst is trying to draw conclusions. This can be done in a number of different ways. A simple random sample is created by randomly selecting people from the population, such as by having a computer pick people at random or by selecting names from a hat. This has the advantage that, based on the laws of probability, it will more than likely be representative of the underlying population. On the other hand, it can be difficult to get a truly random sample. For example, if one wants to know consumers' opinions about a new product, one could pass out samples and collect feedback at the local mall. However, if this data collection were done during a weekday, when most people would be at work, the probability of getting working adults as part of the sample (or even school-aged children, if it were during the school year) would be greatly diminished. Therefore, even if the participants in the study were randomly chosen, the sample (e.g., people in the mall at 2:00 p.m. on Tuesday) would not necessarily represent the population of all shoppers who go to that mall. In addition, it would be difficult to randomly pick who would participate in the study. Just because the computer said that the 144th person to walk in the door should take the survey does not mean that he or she would be willing to participate. In this way, samples can be self-selecting rather than random.
Another way to sample is through systematic sampling. In this scenario, the researcher could select every 20th person who walks in the door of the mall to participate in the survey. It is easier to select the participants in this scenario, but it still may not be a truly random sample depending on self-selection, what door one chose, the time of day, and so forth. One could also choose a convenience sample. In this case, one might ask whoever looks approachable, appears to be interested in the survey or the product, or otherwise is convenient to include in the sample. The advantage of this approach is that it is easy to pick a sample. However, it is also very unlikely that a convenience sample will be truly representative of the underlying population. All the participants whom it is convenient to collect data from may share one or more characteristics, such as attractiveness to the person who is collecting the data, extroversion, or not being employed full time. Often, a better way to select a sample is to choose a stratified random sample. In this approach, one determines a priori what general characteristics one wants to include in the sample (e.g., an equal number of women and men; equal numbers of children, young adults, and adults). Within each of these subgroups (strata), a random sample is chosen. This approach allows one to gather information about specific subgroups of the population and is most likely to yield an accurate representation of each group. On the other hand, this approach can introduce bias of its own if the strata are poorly defined or are sampled out of proportion to their share of the population.
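The sampling schemes described above can be sketched in a few lines of Python; the population of 1,000 numbered "shoppers" and the three age strata are invented for illustration:

    import random

    random.seed(42)
    population = list(range(1, 1001))  # stand-in for 1,000 mall shoppers

    # Simple random sample: every shopper has an equal chance of selection.
    simple = random.sample(population, 60)

    # Systematic sample: every 20th shopper from a random starting point.
    start = random.randrange(20)
    systematic = population[start::20]

    # Stratified random sample: divide the population into subgroups
    # (strata) first, then sample randomly within each one.
    strata = {"children": population[:250],
              "young adults": population[250:500],
              "adults": population[500:]}
    stratified = {name: random.sample(group, 20)
                  for name, group in strata.items()}

    print(len(simple), len(systematic),
          sum(len(sample) for sample in stratified.values()))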
Conclusion
Statistical bias is the tendency for a given experimental design or implementation to unintentionally skew the results of the experiment. Selection bias occurs when the sample asked to participate in the study is selected in a way that is not representative of the underlying population. For example, in the illustration above concerning asking people at the mall in the middle of a weekday to participate, the results of the study could be unfairly biased in the direction of the opinions of people who for whatever reason are able to be at the mall during the day. In addition, one can introduce participation bias into the sample. This occurs when the study participants self-select, either by volunteering or by refusing to participate. This problem is often encountered when trying to collect data by mail. The participants are free to complete the survey or not, and in the great majority of cases, they do not. As a result, the self-selected sample may very well be biased.
Terms & Concepts
Bias: The tendency for a given experimental design or implementation to unintentionally skew the results of the experiment.
Correlation: The degree to which two events or variables are consistently related. Correlation may be positive (as the value of one variable increases, the value of the other variable increases), negative (as the value of one variable increases, the value of the other variable decreases), or zero (the values of the two variables are unrelated). Correlation does not imply causation.
Descriptive Statistics: A subset of mathematical statistics that describes and summarizes data.
Distribution: The set of values observed in the data together with their associated frequencies.
Inferential Statistics: A subset of mathematical statistics used to draw conclusions about a population based on observations of a sample.
Mathematical Statistics: A branch of mathematics that deals with the analysis and interpretation of data. Mathematical statistics provides the theoretical underpinnings for various applied statistical disciplines, including business statistics, in which data are analyzed to find answers to quantifiable questions.
Measures of Central Tendency: Descriptive statistics that are used to estimate the midpoint of a distribution. Measures of central tendency include the median (the number in the middle of the distribution), the mode (the number occurring most often in the distribution), and the mean (a mathematically derived measure in which the sum of all data in the distribution is divided by the number of data points in the distribution).
Null Hypothesis (H0): The statement that the findings of the experiment will show no statistical difference between the control condition and the experimental condition.
Population: The entire group of subjects belonging to a certain category, such as all women between the ages of 18 and 27, all dry-cleaning businesses, or all college students.
Probability: A branch of mathematics that deals with estimating the likelihood of an event occurring. Probability is expressed as a value between 0 and 1, representing the ratio of the number of actual occurrences to the number of possible occurrences of the event. A probability of 0 signifies that there is no chance that the event will occur, while 1 signifies that the event is certain to occur.
Sample: A subset of a population. A random sample is a sample that is chosen at random from the larger population with the assumption that it will reflect the characteristics of the larger population.
Standard Deviation: A measure of variability that describes how far the typical score in a distribution is from the mean of the distribution. The larger the standard deviation, the farther away the typical score is from the mean.
Stochastic: Involving chance or probability. Stochastic variables are random or have an element of chance or probability associated with their occurrence.
Variable: An attribute or quantity in a research study that can take on more than one value.