Normal and Binomial Distributions

The normal distribution is a family of idealized bell-shaped curves derived from a mathematical equation. Normal distributions are unimodal, symmetrical about the mean, and have an area that is always equal to 1. In addition, normal distributions are continuous rather than discrete and are asymptotic to the horizontal axis. Normal distributions can vary in several characteristics, including central location, variation, skewness, and kurtosis. Another well-known distribution is the binomial distribution. This is a discrete distribution that occurs when a single trial of an experiment has only two possible outcomes: success or failure. In some situations, the normal distribution can be used to approximate the binomial distribution. The fact that an underlying distribution approximates the normal distribution can be leveraged so that inferential statistics can be applied to the data in order to do hypothesis testing.

In both business and research, one is constantly bombarded with data. The question in both areas, however, is how best to interpret these data. Although it is nice to know that one received a rating of 85 on a 100-point scale, this statistic alone does not truly provide much information. Other questions need to be asked, including what the mean score and the standard deviation for the sample were. Further, it would be helpful to know how many of those evaluated received a score of 85 or above. If 90 percent of the people or companies evaluated received a score of 85 or above, the score is not so remarkable. If, on the other hand, only 2 percent received a score of 85 or above, a different interpretation should be put on this information. To understand where one's score falls within the larger group of scores from all the people rated or tested, one needs to understand the underlying distribution -- a set of numbers collected from data and their associated frequencies -- within which the score is situated.

Normal Distribution

Although there are as many distributions as there are individual collections of data, there also exists the concept of a "normal" distribution that describes the population from which the sample distributions are drawn. The normal distribution is an idealized bell-shaped curve that is derived from a mathematical equation (Figure 1). Although "the" normal distribution is hypothetical, the family of normal distributions describes a wide variety of characteristics occurring in nature as well as in business and industry. For example, many characteristics of humans including height, weight, speed, life expectancy, IQ, and scholastic achievement all fit within the paradigm of the normal distribution. Similarly, many variables more directly related to business concerns also have a normal distribution. For example, the cost of household insurance, rental cost per square foot of warehouse space, employee satisfaction, performance appraisal ratings, and percentage of defects on a production line can all take the shape of a normal distribution. On a more practical level, the normal distribution provides the basis for many aspects of inferential statistics and hypothesis testing. In addition, it is used by human factors engineers in designing equipment (e.g., so that it can be usable by a wide range of people, such as those between the 5th percentile woman and the 95th percentile man) and by quality control engineers to determine whether or not a process is within quality standards. The normal distribution is also referred to as the Gaussian distribution after Carl Friedrich Gauss, a mathematician and astronomer of the early nineteenth century. Gauss observed that when objects are repeatedly measured, the measurement errors are typically distributed normally. For this reason, the normal distribution is also sometimes referred to as the normal curve of errors.

Characteristics of Normal Distributions

Normal distributions have several distinguishing characteristics. Normal distributions are continuous rather than discrete and are asymptotic to the horizontal axis (i.e., they never cross or touch the axis, but continue into infinity becoming ever closer to the axis). Normal distributions are also unimodal (i.e., have only one mound), such that the mound is in the center and the graph of the distribution is symmetrical about its mean (i.e., the two halves are mirror images of each other). Another property of normal distributions is that the area under the curve is always equal to 1.
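The unit-area property can be checked numerically. The short Python sketch below (an illustration, assuming the standard normal distribution with mean 0 and standard deviation 1) integrates the bell curve with the trapezoid rule; because the curve is asymptotic to the axis, integrating over a wide interval captures essentially all of the area:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x, for mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Trapezoid-rule integration over +/- 8 standard deviations; the tails
# beyond that range contribute a negligible amount of area.
a, b, n = -8.0, 8.0, 10_000
h = (b - a) / n
area = (normal_pdf(a) + normal_pdf(b)) / 2 * h
area += sum(normal_pdf(a + i * h) for i in range(1, n)) * h

# area is approximately 1, illustrating that the total area under the curve is 1
```

The same check works for any mean and standard deviation, since every member of the family of normal curves encloses exactly one unit of area.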

Variations in Normal Distributions

Specific normal distributions can vary on a number of different characteristics, particularly: central location, variation, skewness, and kurtosis.

  • Central location is the value of the midpoint of the distribution. Measures of central tendency for a distribution include the median (the number in the middle of the distribution), the mode (the number occurring most often in the distribution), and the mean (a mathematically derived measure in which the sum of all data in the distribution is divided by the number of data points in the distribution).
  • Variation is the degree to which the values cluster around the central value. If the values are clustered closely together there will be less variation than if the values are spread further apart.
  • Skewness refers to whether or not the distribution is symmetrical around its central value. Skew refers to the end of the distribution where there is the least concentration of values. As shown in Figure 2, distributions that are asymmetrical and whose values cluster to the left of the central value (i.e., the majority of the values are lower than the central value) are said to have positive skew because the longer tail of the distribution is on the positive side of the central value. Distributions whose values cluster on the right of the central value are said to have negative skew (i.e., the longer tail is on the negative side of the central value). Distributions that are symmetrical about the central value are not skewed. Skewness can be an important consideration in statistics. For example, the mean is by definition pulled in the direction of the skew. The median, although more stable than the mean, is also influenced by the direction of the skew.

  • Another characteristic on which frequency distributions can differ is kurtosis. This characteristic refers to the degree to which a distribution is peaked (i.e., whether it is flat or tall in comparison with the normal distribution) near the central point of the distribution. As shown in Figure 3, a distribution is said to be leptokurtic if it is tall and thin in comparison with the normal distribution or platykurtic if it is flat in comparison with the normal distribution. Normally shaped distributions are said to be mesokurtic.
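As a small illustration of the measures of central location described above (Python, using a made-up set of ratings rather than data from the article), the three measures can be computed directly, and comparing the mean with the median gives a quick check on the direction of skew:

```python
import statistics

ratings = [22, 30, 30, 41, 55, 60, 68, 75, 80, 85]  # hypothetical rating data

mean = statistics.mean(ratings)      # sum of all values divided by the number of values
median = statistics.median(ratings)  # the middle value of the sorted data
mode = statistics.mode(ratings)      # the most frequently occurring value

# Because the mean is pulled toward the longer tail of a skewed distribution,
# mean > median suggests positive skew and mean < median suggests negative skew.
```

Here the mean (54.6) falls below the median (57.5), hinting at a slight negative skew in this hypothetical sample, while the mode (30) sits far from both, a reminder of how unstable the mode can be in small samples.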

As mentioned above, the normal distribution is not a single distribution, but is actually a family of curves. In fact, every unique mean and every unique standard deviation results in a different normal distribution. As shown in Figure 4, two distributions that are alike in every way except with different means (μ, represented by the circle in Figure 4) will look identical but be in different places on the x-axis. Similarly, changes in the standard deviation (s, represented by a diamond in Figure 4) of a distribution will change how flat and wide it is.

Binomial Distributions

Another well-known distribution is the binomial distribution. This is a discrete distribution that occurs when a single trial of an experiment has only two possible outcomes: success or failure. This describes situations such as tossing a coin (the two possibilities are heads or tails), some quality control situations where the product either passes inspection or fails, or a pass/fail training class. In addition, the binomial distribution is based on the assumption that the probability of getting a success on any one trial (p) and the probability of getting a failure (q = 1 - p) on any one trial remain constant across all of the trials. Success and failure in a binomial distribution are mutually exclusive: a coin cannot be simultaneously heads up and tails up, a product cannot simultaneously be acceptable and unacceptable using the same standard, and a course cannot be simultaneously passed and failed using the same learning criteria. The binomial distribution is considered to be a discrete distribution because the number of trials (e.g., how many times the coin is tossed or how many products are checked for quality) limits the number of possible successes. In addition, the binomial distribution assumes that the trials are independent. For example, getting a head on the first toss of a coin has no influence on whether or not the next toss will result in heads or tails. In some situations, the normal distribution can be used to approximate the binomial distribution. This is because as sample sizes increase, binomial distributions approach the shape of the normal distribution regardless of the value of p. The normal distribution is the limiting form of the binomial distribution.
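The approximation is easy to see numerically. The sketch below (Python, using coin tossing as the example) computes the exact binomial probability of 50 heads in 100 tosses of a fair coin and compares it with the height of the normal curve that has the same mean (np) and standard deviation (the square root of npq):

```python
import math

def binomial_pmf(k, n, p):
    """Exact probability of k successes in n independent trials, success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

n, p = 100, 0.5                      # 100 tosses of a fair coin
mu = n * p                           # mean of the binomial: np = 50
sigma = math.sqrt(n * p * (1 - p))   # standard deviation: sqrt(npq) = 5

exact = binomial_pmf(50, n, p)       # exact binomial probability of 50 heads
approx = normal_pdf(50, mu, sigma)   # normal-curve approximation at the same point
```

With n = 100 the two values already agree to about three decimal places (roughly 0.0796 versus 0.0798), and the agreement improves as the number of trials grows.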

Applications

Frequency Distributions

In statistics, one of the basic techniques for understanding a set of data is the development of a frequency distribution. To turn a collection of raw data into a frequency distribution, the data are divided into intervals of typically equal length and graphed with scores on the x-axis and frequencies of scores (within the intervals) on the y-axis. By graphing data within intervals in this way, the number of data points on the graph is reduced and the graph -- and the underlying data -- becomes easier to comprehend. For example, Figure 5 shows a scatter plot of data from 50 people who rated a new widget design on a scale from 1 to 100. Although the scatter plot correctly shows that the rating most frequently received was 30, this does not mean that the average value of the ratings was 30. (The median is actually 41 and the mean is 43.18). In addition, even though the 100-point scale may seem quantitative because there are numbers attached, the underlying ratings are still qualitative in nature: Harvey may think that an above average score is 60 while Mathilde may think that an above average score is 80. Therefore, one must consider whether or not there really is a meaningful difference between a rating of 22 on a 100-point scale and a rating of 23. In either case, the person responding did not like the new widget.

To help overcome these problems and determine whether or not an underlying distribution approximates the normal distribution, data sets are typically aggregated within intervals and then graphed. If the same data were divided into 10 equal intervals (i.e., scores of 1-10, 11-20, etc.) and the frequencies graphed on a histogram, the number of points on the graph would be reduced so that larger patterns can emerge and be understood. As shown in Figure 6, one can now see that the sample distribution for these ratings approximates the normal distribution.
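The aggregation step described above can be sketched in a few lines of Python (the ratings list here is a hypothetical stand-in for the widget data, and the 1-100 scale with 10 equal intervals follows the example in the text):

```python
def frequency_distribution(scores, low=1, high=100, n_intervals=10):
    """Count how many scores fall into each equal-width interval (1-10, 11-20, ...)."""
    width = (high - low + 1) // n_intervals
    counts = [0] * n_intervals
    for s in scores:
        # Integer division maps a score to its interval; min() guards the top edge.
        idx = min((s - low) // width, n_intervals - 1)
        counts[idx] += 1
    return counts

ratings = [8, 15, 22, 23, 30, 30, 35, 41, 47, 55, 62, 78, 91]  # hypothetical sample
bins = frequency_distribution(ratings)
```

Plotting `bins` as a histogram collapses the raw scatter of individual ratings into ten bars, which is what allows the larger bell-like pattern to emerge.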

Frequency Distribution & Hypothesis Testing

The fact that an underlying distribution approximates the normal distribution can be leveraged so that inferential statistics can be applied to the data in order to do hypothesis testing. Armed with the knowledge that the area under the normal distribution is always equal to 1, one can go on to determine whether the results of various inferential statistics are likely due to chance or to a real difference in the underlying populations. For example, if one wanted to estimate the number of widgets coming off the production line that would pass quality criteria, one could use the normal distribution. To do this, one would first need two pieces of information about the distribution: the mean (X̄) and the standard deviation (sx) of the sample. These two pieces of information help one understand the shape of the curve. The mean acts as a marker to help one understand where in the distribution a particular score lies, while the standard deviation acts as the scale on a map to let one know how far from the mean a particular score lies. To interpret the table for the normal distribution, one also needs a third piece of information called a z-score. A z-score is the number of standard deviation units that a given score is above or below the mean of the distribution. The formula for deriving a z-score is z = (X - X̄)/sx. This formula can be used to transform a set of raw scores with any mean and standard deviation into z-scores that have a mean of 0 and a standard deviation of 1. When raw scores are translated into z-scores, one can better interpret what a score means because the z-score states how many standard deviation units that score lies above or below the mean (Figure 7).
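The transformation can be demonstrated directly (Python, with a hypothetical set of raw measurements standing in for production data); after every score is converted with z = (X - X̄)/sx, the resulting z-scores have a mean of 0 and a standard deviation of 1:

```python
import statistics

scores = [35.0, 41.0, 43.0, 47.0, 50.0, 58.0]  # hypothetical raw measurements

x_bar = statistics.mean(scores)  # sample mean, X-bar
s_x = statistics.stdev(scores)   # sample standard deviation, s_x

# z = (X - X-bar) / s_x : standard deviation units above or below the mean
z_scores = [(x - x_bar) / s_x for x in scores]

# The standardized scores have mean approximately 0 and standard deviation 1,
# so each z-score can be looked up directly in a table of the normal distribution.
```

A score one full standard deviation above the mean always becomes z = 1 regardless of the original units, which is what makes scores from different scales comparable.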

The Altman Z-score

A classic example of how the normal distribution can be applied in business is the prediction of bankruptcy. The Altman Z-score is a multivariate formula that is used to measure the financial health of a company and forecast its probability of filing for bankruptcy. This technique is used by credit analysts, bond rating agencies, investment firms, traders, and academics. It has been found to be very successful in predicting near-term financial distress. Bankruptcies are a major concern for credit professionals. Part of the reason that business bankruptcies can be so troubling is that they often happen suddenly, with many of the companies filing for bankruptcy meeting their current financial obligations right up until the time that they file. However, Z-scores have been successfully used for years to predict business bankruptcies as much as one year in advance of filing with up to 80 percent accuracy. As a result, credit professionals typically include a calculation of the Z-score in their analysis of credit applicants and current customers.
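A minimal sketch of the calculation is shown below (Python). It uses Altman's original weighting for publicly traded manufacturing firms and the conventional interpretation bands for that model; the firm's figures are entirely hypothetical:

```python
def altman_z(working_capital, retained_earnings, ebit,
             market_value_equity, sales, total_assets, total_liabilities):
    """Altman's original Z-score for publicly traded manufacturing firms."""
    x1 = working_capital / total_assets        # liquidity
    x2 = retained_earnings / total_assets      # accumulated profitability
    x3 = ebit / total_assets                   # operating efficiency
    x4 = market_value_equity / total_liabilities  # leverage
    x5 = sales / total_assets                  # asset turnover
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5

# Hypothetical firm (figures in millions)
z = altman_z(working_capital=50, retained_earnings=200, ebit=120,
             market_value_equity=800, sales=1100,
             total_assets=1000, total_liabilities=400)

# Conventional bands for the original model:
#   Z > 2.99  -> "safe" zone
#   Z < 1.81  -> "distress" zone
#   otherwise -> grey zone
```

For this hypothetical firm the score works out to about 3.04, placing it just inside the safe zone; a declining Z-score over successive periods is the warning sign credit professionals watch for.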

A 2006 study applied the Altman Z-score methodology to an investigation of the financial distress of 50 teaching hospitals. Unlike most studies, which have focused on the profitability, liquidity, debt structure, and efficiency of teaching hospitals, the study's author, Langabeer, included a calculation of the Z-scores of the hospitals in order to determine their financial distress. Previous criteria of insolvency included the number of days of cash on hand to cover normal operating expenses and similar indicators that occur too late in the process to be of use in rectifying the situation.

The 2006 study found that nearly 17 percent of the hospitals sampled were in danger of filing for bankruptcy in the near future, with the majority of the others not far behind. The mean Z-score decreased from 4.67 to 4.08 between 2002 and 2004, a trend toward increased financial distress. Not only did the study find that the scores were trending in the wrong direction, but also that the distribution of the hospitals within the population was skewed, with most of the hospitals studied headed toward financial distress. The study concluded that teaching hospitals were operating in a much more turbulent and competitive environment than in previous periods, and that they needed to use more advanced financial strategies than had typically been employed.

Business applications of the normal distribution and z-scores are not limited to financial applications, however. Liu used z-scores to investigate variations in social quality of life indicators in 83 medium metropolitan areas, including individual concerns, individual equality, and community living conditions. The purpose of the research was to develop a commonly accepted value system to rate quality of life. The resultant conceptual model can be used to systematically evaluate various social elements of quality of life and living concerns within American urban areas.

Terms & Concepts

Binomial Distribution: A discrete distribution that occurs when a single trial of an experiment has only two possible outcomes: success or failure. As sample sizes increase, binomial distributions approach the shape of the normal distribution.

Data: (sing. datum) In statistics, data are quantifiable observations or measurements that are used as the basis of scientific research.

Distribution: A set of numbers collected from data and their associated frequencies.

Frequency Distribution: A graphing technique in which an observed distribution is sectioned into intervals (typically of equal size) and the data within the intervals are summarized and displayed in a bar chart.

Hypothesis: An empirically testable declaration that certain variables and their corresponding measures are related in a specific way proposed by a theory. A null hypothesis (H0) is the statement that the findings of an experiment will show no statistical difference between the current condition (control condition) and the experimental condition.

Inferential Statistics: A subset of mathematical statistics used in the analysis and interpretation of data. Inferential statistics are used to make inferences such as drawing conclusions about a population from a sample and in decision making.

Kurtosis: The degree to which a distribution is peaked (i.e., whether it is flat or tall in comparison with the normal distribution) near the mean of the distribution. Leptokurtosis occurs when the distribution is tall; platykurtosis occurs when the distribution is flat. Normally shaped distributions are mesokurtic.

Leptokurtic: A characteristic of a distribution that is tall and thin in comparison with the normal distribution.

Mean: An arithmetically derived measure of central tendency in which the sum of the values of all the data points is divided by the number of data points.

Mesokurtic: A characteristic of a distribution which is normal in shape, i.e., neither flat and wide (platykurtic) nor tall and thin (leptokurtic).

Normal Distribution: A continuous distribution that is symmetrical about its mean and asymptotic to the horizontal axis. The area under the normal distribution is 1. The normal distribution is actually a family of curves and describes many characteristics observable in the natural world. The normal distribution is also called the Gaussian distribution or the normal curve of errors.

Platykurtic: A characteristic of a distribution that is flat and wide in comparison with the normal distribution.

Sample: A subset of a population. A random sample is a sample that is chosen at random from the larger population with the assumption that such samples tend to reflect the characteristics of the larger population.

Skewed: A distribution that is not symmetrical around the mean (i.e., there are more data points on one side of the mean than there are on the other).

Standard Deviation: A measure of variability that describes how far the typical score in a distribution is from the mean of the distribution. The standard deviation is obtained by determining the deviation of each score from the mean (i.e., subtracting the mean from the score), squaring the deviations (i.e., multiplying them by themselves), adding the squared deviations, and dividing by the total number of scores. The larger the standard deviation, the more widely the scores are spread around the mean of the distribution.

Bibliography

Black, K. (2006). Business statistics for contemporary decision making (4th ed.). New York: John Wiley & Sons.

Ferguson, G. A. (1971). Statistical analysis in psychology and education (3rd ed.). New York: McGraw-Hill Book Company.

Griffith, D. A. (2013). Better articulating normal curve theory for introductory mathematical statistics students: Power transformations and their back-transformations. American Statistician, 67(3), 157-169. Retrieved December 3, 2013 from EBSCO Online Database Business Source Premier. http://search.ebscohost.com/login.aspx?direct=true&db=buh&AN=90179016

Hlouskova, J., Mikocziova, J., Sivak, R., & Tsigaris, P. (2014). Capital income taxation and risk-taking under prospect theory: the continuous distribution case. Finance A Uver: Czech Journal of Economics & Finance, 64(5), 374–391. Retrieved November 24, 2014, from EBSCO Online Database Business Source Complete. http://search.ebscohost.com/login.aspx?direct=true&db=bth&AN=99052948

Jance, M. (2013). The bakery. Journal of Business Case Studies, 9(6), 415-428. Retrieved December 3, 2013 from EBSCO Online Database Business Source Premier. http://search.ebscohost.com/login.aspx?direct=true&db=buh&AN=92521140

Langabeer, J. (2006). Predicting financial distress in teaching hospitals. Journal of Health Care Finance, 33(2), 84-92. Retrieved September 7, 2007, from EBSCO Online Database Business Source Complete. http://search.ebscohost.com/login.aspx?direct=true&db=bth&AN=23606137&site=ehost-live

Lemke, E. & Wiersma, W. (1976). Principles of psychological measurement. Chicago: Rand McNally College Publishing Company.

Liu, B.-C. (1978). Variations in social quality of life indicators in medium metropolitan areas. American Journal of Economics & Sociology, 37(3), 241-260. Retrieved September 7, 2007, from EBSCO Online Database Business Source Complete. http://search.ebscohost.com/login.aspx?direct=true&db=bth&AN=4511757&site=ehost-live

Weiß, C., & Kim, H. (2013). Parameter estimation for binomial AR(1) models with applications in finance and industry. Statistical Papers, 54(3), 563–590. Retrieved November 24, 2014, from EBSCO Online Database Business Source Complete. http://search.ebscohost.com/login.aspx?direct=true&db=bth&AN=88368806

Witte, R. S. (1980). Statistics. New York: Holt, Rinehart and Winston.

Sample Size in Usability Studies. (2012). Communications of the ACM, 55(4), 64-70. Retrieved December 3, 2013 from EBSCO Online Database Business Source Premier. http://search.ebscohost.com/login.aspx?direct=true&db=buh&AN=74716947

Z-score: The old reliable method for predicting bankruptcy still works. (2003). Managing Credit, Receivables & Collections, 3(10), 6-7. Retrieved September 7, 2007, from EBSCO Online Database Business Source Complete. http://search.ebscohost.com/login.aspx?direct=true&db=bth&AN=10808633&site=ehost-live

Suggested Reading

Amezziane, M. (2012). A binomial model for the kernel density estimator and related inference. Journal of Statistical Computation & Simulation, 82(2), 151-164. Retrieved December 3, 2013 from EBSCO Online Database Business Source Premier. http://search.ebscohost.com/login.aspx?direct=true&db=buh&AN=71115734

Alzaid, A. A., & Omair, M. A. (2012). An extended binomial distribution with applications. Communications in Statistics: Theory & Methods, 41(19), 3511-3527. Retrieved December 3, 2013 from EBSCO Online Database Business Source Premier. http://search.ebscohost.com/login.aspx?direct=true&db=buh&AN=82902087

Bee, M. (2004). Testing for redundancy in normal mixture analysis. Communications in Statistics: Simulation & Computation, 33(4), 915-936. Retrieved September 7, 2007, from EBSCO Online Database Business Source Complete. http://search.ebscohost.com/login.aspx?direct=true&db=bth&AN=15399601&site=ehost-live

Chunlei, W., Rodan, S., Fruin, M., & Xiaoyan, X. (2014). Knowledge networks, collaboration networks, and exploratory innovation. Academy of Management Journal, 57(2), 454–514. Retrieved November 24, 2014, from EBSCO Online Database Business Source Complete. http://search.ebscohost.com/login.aspx?direct=true&db=bth&AN=95609985

Mahoney, J. F. (1998). The influence of parent population distribution on d[SUB2] values. IIE Transactions, 30(6), 563-569. Retrieved September 7, 2007, from EBSCO Online Database Business Source Complete. http://search.ebscohost.com/login.aspx?direct=true&db=bth&AN=11873757&site=ehost-live

Shih, W. (1980). Optimal inventory policies when stockouts result from defective products. International Journal of Production Research, 18(6), 677-686. Retrieved September 7, 2007, from EBSCO Online Database Business Source Complete. http://search.ebscohost.com/login.aspx?direct=true&db=bth&AN=5782775&site=ehost-live

Essay by Ruth A. Wienclaw, PhD

Dr. Ruth A. Wienclaw holds a doctorate in industrial/organizational psychology with a specialization in organization development from the University of Memphis. She is the owner of a small business that works with organizations in both the public and private sectors, consulting on matters of strategic planning, training, and human/systems integration.