Statistical hypothesis testing
Statistical hypothesis testing is a fundamental concept in statistics that involves evaluating claims or premises, known as hypotheses, about a population based on collected data. The process begins with the formulation of a null hypothesis (H0), which represents a widely accepted belief, and an alternative hypothesis (Ha), which challenges that belief. By conducting tests on gathered data, statisticians aim to determine whether the evidence supports rejecting the null hypothesis or failing to reject it, in which case the null hypothesis stands as not yet disproven.
Historically, hypothesis testing has evolved significantly, with early contributions from figures like John Arbuthnot and later advancements by statisticians such as Ronald Fisher. The methodology is applicable across various fields, from healthcare studies to criminal justice, where it serves to assess the validity of existing beliefs against new evidence. For example, when examining the average weight of newborns, researchers might compare current data against an established average, thereby utilizing the hypothesis testing framework to derive insights about population changes. Ultimately, hypothesis testing provides a systematic approach to decision-making grounded in statistical evidence, allowing researchers to quantify their confidence in the conclusions drawn from their data analyses.
Statistical hypothesis testing is an important topic in the field of statistics. Statistics is a mathematical science that deals with the collection, analysis, and interpretation of data. Statisticians, or people who study or use statistics, may work in a wide variety of fields and consider questions that examine human life in countless ways. They frequently use statistical hypothesis testing to examine beliefs that are currently widely accepted (referred to as “null hypotheses”) and to challenge these existing beliefs with new beliefs (referred to as “alternative hypotheses”). The goal of statistical hypothesis testing is to reject null hypotheses or to fail to reject null hypotheses. In other words, the goal is to disprove an existing belief or to accept that the belief has not been disproven.


Background
The use of statistics dates back to ancient times and has grown tremendously over the centuries. In the early modern period, one of the pioneers of statistical hypothesis testing was physician John Arbuthnot. His 1710 study into the percentages of female and male children was meant to show that the sexes were held in balance, presumably by “Divine Providence.” Other statistical tests during the eighteenth century also examined birth rates, while others explored diverse features of society and nature, such as the distribution of stars in the night sky. Statistical hypothesis testing gained wider acceptance and became increasingly useful in the following centuries. Some of its major proponents have included Karl Pearson, William Gosset, Ronald Fisher, Jerzy Neyman, and Egon Pearson.
Statistical hypothesis testing may be applied to a great variety of cases, and may be adapted to suit different circumstances. However, all instances of hypothesis testing share some features. The basic feature of any hypothesis test is its hypotheses. This term arises frequently in the sciences, and its meaning in statistics is largely similar. A statistical hypothesis is a claim or a premise that statisticians will attempt to test.
While other scientists may study hypotheses in the laboratory using experiments or other means, statisticians usually create and test hypotheses using data gathered from surveys and other sources of information about a given population.
Once statisticians determine which hypothesis they will explore, they must devise an appropriate test. Most statistical tests begin with the selection of a statement called a null hypothesis. Although the word null generally means “empty” or “zero,” in this context it refers to a default belief, which has already been established and is currently widely accepted by most scientists. The null hypothesis may be based on previously published reports or studies. This established information might discuss any measurable feature of a population, such as height, birth rate, household income, and so on. In statistical shorthand, the null hypothesis is represented by the letters “H0,” with “H” standing for hypothesis, and the subscript zero denoting the idea of null. This shorthand is read aloud as H-naught.
The testing begins when the statistician introduces a new hypothesis that seeks to question, re-evaluate, or challenge the previously established null hypothesis. This new hypothesis is called the alternative hypothesis, or the research hypothesis, and is styled as “Ha” (pronounced H-sub-A) in shorthand. Basically, the alternative hypothesis states the claim that the statistician will test.
Overview
An example case that employs statistical hypothesis testing might involve a study of the weights of newborns in a particular region. Researchers may suspect that newborn babies in the region do not have the same average weight as those of prior generations. The first step to examining this question statistically will require researchers to find established and widely accepted ideas about newborns’ weight.
Some preliminary testing has suggested that newborns weigh, on average, about seven and a half pounds. Hospital administrators believe that newborns are still averaging that weight. This finding will provide the study with a null hypothesis. The researchers may write this information in shorthand as H0: μ = 7.5 lbs., where the μ stands for the population mean, or the average value across the whole population being studied.
Next, the researchers will present their alternative, or research, hypothesis, which claims that newborn babies in the region now have a different average weight than they had when the existing data was recorded. This hypothesis may be styled as Ha: μ ≠ 7.5 lbs., with the ≠ representing “not equal.”
When established in this way, the null hypothesis and alternative hypothesis stand as mathematical opposites. Basically, the alternative hypothesis states that the null hypothesis is not correct. This mathematical opposition is a crucial feature of statistical hypothesis testing. It creates a situation in which the new, alternative hypothesis must prove its own value and accuracy against the established common knowledge. Until it does so, people are left to assume that the null hypothesis is still correct.
Now that the opposing hypotheses have been established, researchers may begin their investigation and test of the matter. They may perform surveys, gather data, and process different types of statistical information. In this case, researchers are likely to ask hospitals in the region for data about the weights of recent newborn babies. Researchers may also perform surveys among medical personnel who work in delivery rooms, or among the parents or guardians of newborn babies in the region, asking about the weight of the babies.
Researchers will seek to sample enough data to compute a new statistic. In this case, the statistic would be the average weight of recent newborn babies in the region. The researchers must then determine whether this sample average differs from the value stated in the null hypothesis (an average weight of seven and a half pounds) by more than chance alone would plausibly explain.
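One common way to measure that difference, offered here only as an illustrative sketch, is a standardized test statistic. The formula below assumes a large sample and a known population standard deviation; the symbols x̄ (sample mean), μ0 (the mean stated in the null hypothesis), σ (standard deviation), and n (sample size) are introduced here for illustration and do not appear in the example above.

```latex
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}, \qquad \mu_0 = 7.5 \text{ lbs.}
```

The further z lies from zero, the less plausible it is that the observed difference arose by chance alone. Note that the same raw difference yields a larger z when the sample is larger or the weights vary less.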
For instance, the researchers may determine that the average modern newborn weighs eight pounds. That represents a half-pound difference from the established number, which may well be statistically significant and worth reporting, provided the sample is large enough and the weights do not vary too widely. Alternatively, the researchers may determine that the average modern newborn weighs about seven and four-tenths pounds. This result only differs by one-tenth of a pound from the accepted average. This small difference may not be statistically significant. It may be the result of chance, coincidence, or even small mathematical errors. The researchers may then decide to discard their alternative hypothesis because the data do not provide enough evidence against the null hypothesis, and, in fact, the findings may be consistent with the null hypothesis.
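Whether a given difference counts as statistically significant depends on the sample size and the spread of the data, not on the size of the difference alone. The following Python sketch illustrates this with a two-sided one-sample z-test. It is a simplified illustration rather than the method of any particular study: the standard deviation of 1.2 lbs. and the sample sizes are hypothetical values chosen for the example, and the z-test itself assumes a large sample with a known standard deviation.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean, null_mean, sigma, n):
    """Return (z, p_value) for a two-sided one-sample z-test.

    Assumes sigma is the known population standard deviation and
    n is large enough for the normal approximation to hold.
    """
    standard_error = sigma / sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    # Probability of a result at least this extreme if H0 were true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


NULL_MEAN = 7.5  # H0: mu = 7.5 lbs. (from the newborn example)
SIGMA = 1.2      # hypothetical standard deviation, in lbs.

# (sample mean, hypothetical sample size)
scenarios = [(8.0, 100), (8.0, 5), (7.4, 100), (7.4, 2000)]

for sample_mean, n in scenarios:
    z, p = two_sided_z_test(sample_mean, NULL_MEAN, SIGMA, n)
    print(f"mean={sample_mean} n={n}: z={z:.2f}, p={p:.4f}")
```

Under these assumed values, the half-pound difference produces a small p-value with 100 babies but not with only 5, while the one-tenth-pound difference is not significant with 100 babies yet becomes significant with 2,000.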
When the researchers’ work is completed, they may reach one of two possible outcomes. One outcome is that researchers can reject, or dismiss as untrue, the null hypothesis. At that time, the alternative hypothesis will gain status, and people will be more likely to believe it instead of the null hypothesis. The second outcome is that the researchers fail to reject the null hypothesis. That means that they cannot find a reasonable means by which to disprove the null hypothesis. However, this does not automatically prove that the null hypothesis is correct. It simply means it has not been proven incorrect. In this outcome, the null hypothesis is likely to remain the most accepted belief, while the alternative hypothesis will be discarded or revised.
Alongside their decision (i.e., whether to reject, or fail to reject, the null hypothesis), the researchers report a level of confidence. This describes how strong the evidence had to be before they would reject the null hypothesis, and it is ordinarily chosen before the data are analyzed. The level of confidence is usually stated as a percentage value. For example, a 60 percent level of confidence sets a low bar, so a conclusion reached at that level cannot be strongly supported. Meanwhile, a 95 percent level of confidence sets a high bar: if the null hypothesis were true, a test at that level would wrongly reject it only about 5 percent of the time, so a rejection represents strong evidence in support of the final decision.
Statistical hypothesis testing may be applied to an endless variety of questions that deal with statistical information. These questions may concern industry, demographics, or politics. The same logic even arises in the criminal justice system, and the courtroom offers a good illustration. When a new criminal case begins, the existing belief is that the defendant is innocent until proven guilty. That belief forms the prevailing null hypothesis. The prosecutors are then challenged to find and provide evidence that can prove that the defendant is actually guilty. If they succeed, the defendant will likely be judged guilty. Until that time, or if they fail to prove guilt, the null hypothesis will persist, and the defendant will not be considered guilty.
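The link between a level of confidence and the final decision can be sketched in Python. This is an illustrative sketch under stated assumptions, not a prescribed procedure: it uses a two-sided z-test, which assumes a large sample and a known standard deviation, and the standard deviation of 1.2 lbs. and the sample sizes are hypothetical values. The function names are likewise invented for this example.

```python
from math import sqrt
from statistics import NormalDist


def critical_value(confidence_level):
    """Two-sided critical z value for a confidence level such as 0.95."""
    alpha = 1 - confidence_level  # significance level
    return NormalDist().inv_cdf(1 - alpha / 2)


def decide(sample_mean, null_mean, sigma, n, confidence_level=0.95):
    """Return the test decision for a two-sided one-sample z-test."""
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    if abs(z) > critical_value(confidence_level):
        return "reject H0"
    return "fail to reject H0"


# Hypothetical inputs: sigma = 1.2 lbs., H0: mu = 7.5 lbs.
print(decide(8.0, 7.5, 1.2, n=100))                       # large sample
print(decide(8.0, 7.5, 1.2, n=5, confidence_level=0.95))  # high bar
print(decide(8.0, 7.5, 1.2, n=5, confidence_level=0.60))  # low bar
```

With only five babies in the sample, the same half-pound difference clears the low bar of a 60 percent level but not the higher bar of a 95 percent level, which is why a conclusion reached at 60 percent carries much less weight.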