Student's t-test
Student's t-test is a statistical method used to assess whether differences observed in small samples reflect real differences in the larger populations from which the samples are drawn. It is particularly useful when the data from each population is assumed to be normally distributed and to share similar standard deviations. Normally distributed data typically appears on a graph as a symmetrical, bell-shaped curve centered on the average value. Developed by British statistician William Sealy Gosset in the early 1900s, the t-test was originally designed for quality control at a brewery, where it allowed small samples to be analyzed reliably, something that earlier methods built for large samples could not do.
The test can compare the means of different groups, such as evaluating the effectiveness of different teaching methods on student performance. The t-test operates under a null hypothesis, which posits that there is no significant difference between the averages of the groups being compared. If the results suggest otherwise, it indicates that the variable being tested may have had an effect. Researchers can choose between paired t-tests, which assess the same subjects before and after a change, or unpaired t-tests, which compare two independent groups. While the t-test is a powerful tool, it is important to acknowledge that experimental errors can affect the outcomes, and the results should be interpreted cautiously.
In statistics, Student's t-test is a method used to test how changes affect small samples of a larger population. To be valid, the t-test assumes that the data of each population is normally distributed and that the populations share the same standard deviation. When normally distributed data is plotted on a graph, the individual values cluster around the data's average, or mean. The resulting curve rises from near the x-axis at below-average values, peaks at the data's average, and falls back toward the x-axis at above-average values. Standard deviation, meanwhile, measures how widely the values in a data set are scattered around the average. If the data of each sample group in a Student's t-test is normally distributed and the groups' standard deviations are about the same, the t-test can measure the differences between the samples after an experiment has been performed on them. For example, the Student's t-test might be useful in determining whether two groups of students who were taught the same material in different ways differ in their test results.
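The two quantities described above, the mean and the standard deviation, can be computed directly for each sample group. The following is a minimal sketch using Python's standard `statistics` module; the score lists and the names `method_a`, `method_b`, and `summarize` are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical test scores for two groups of students taught
# the same material in different ways (illustrative data only).
method_a = [78, 82, 85, 88, 92]
method_b = [70, 75, 77, 80, 83]


def summarize(scores):
    """Return the sample mean and sample standard deviation of the scores."""
    return statistics.mean(scores), statistics.stdev(scores)


mean_a, sd_a = summarize(method_a)
mean_b, sd_b = summarize(method_b)

print(f"Method A: mean = {mean_a:.2f}, standard deviation = {sd_a:.2f}")
print(f"Method B: mean = {mean_b:.2f}, standard deviation = {sd_b:.2f}")
```

In this made-up data the two standard deviations are of similar size, which is the situation the t-test's equal-spread assumption describes.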
![British statistician William Sealy Gosset developed the "t-statistic" and published it under the pseudonym of "Student." By User Wujaszek on pl.wikipedia [Public domain], via Wikimedia Commons](https://imageserver.ebscohost.com/img/embimages/ers/sp/embedded/rsspencyclopedia-20170119-51-154290.jpg?ephost1=dGJyMNHX8kSepq84xNvgOLCmsE2epq5Srqa4SK6WxWXS)
![t-test. By Shamuswheeler (Own work) [CC BY-SA 4.0 (http://creativecommons.org/licenses/by-sa/4.0)], via Wikimedia Commons](https://imageserver.ebscohost.com/img/embimages/ers/sp/embedded/rsspencyclopedia-20170119-51-154291.jpg?ephost1=dGJyMNHX8kSepq84xNvgOLCmsE2epq5Srqa4SK6WxWXS)
Background
British statistician and chemist William Sealy Gosset designed the Student's t-test in the early 1900s, while he was an employee of the Guinness brewery in Dublin, Ireland. In this era, it was not unusual for statisticians to be working for Guinness, as the company employed them to test and refine the ingredients in its beer. Gosset inspected both the beer itself and the barley grown by Guinness to make the beer.
Gosset was permitted to test only small samples of Guinness's products. To him, this was a problem, since he could not be sure whether his test results were typical or abnormal for the entire population of Guinness ingredients and beer. To find out, Gosset devised a mathematical formula specifically to measure how small samples relate to the large population from which they were taken. This formula was revolutionary for the time. Statistics to that point had concentrated almost exclusively on large samples, as these were believed to represent a population more accurately. One of Gosset's formulas tested small samples of barley and beer, while another calculated how individual yeast cells were distributed during brewing.
Gosset then began publishing scholarly articles about his new research methods. In 1907, he published his findings on yeast distribution in brewing. The next year, he published more about his formulas in the British scientific journal Biometrika. However, Guinness wanted to suppress the fact that it employed statisticians to improve the quality of its beer. The company dictated that none of its employees could publish research using their real names. Gosset therefore published his research using the pseudonym "Student." His statistical test thereby became known as Student's t-test, with Gosset having randomly chosen the letter t as part of the name. In his journal articles, Gosset described his equations for estimating population averages using small sample sizes.
Gosset went on to publish works on the analysis of data from agricultural experiments. This, combined with British statistician Ronald Fisher's publicizing of the t-test, led to Gosset's research method being used for more mainstream statistical purposes, not only for product testing in breweries. The t-test revolutionized the manufacturing industry, particularly in the area of quality control, as corporate researchers could now use the test to inspect small product samples rather than collect large samples as they had previously done. The t-test also changed the field of statistics by offering statisticians a new way to test their hypotheses and examine deviations in data sets.
Overview
The t-test is most reliable when its assumption that the data in each test sample is normally distributed holds. The test evaluates a null hypothesis, which states that the averages of the two populations being compared are not different in any remarkable way. The t-test produces the best results when the samples are randomly selected and independent of each other, so their data sets do not interfere with each other. Researchers who have collected their sample data should plot each data set on a graph. The data should be normally distributed around an average, appearing as one large peak on the graph. This normal distribution in each sample supports the accuracy of the t-test. If one sample's data is normally distributed and the other's is distributed very differently, the t-test may return misleading results.
For instance, statisticians may use a t-test to see if different teaching methods have any effect on how two separate groups of students score on tests. In this case, the null hypothesis states that the average scores of the two groups are the same. The t-test formula measures whether the observed difference between the groups' averages is larger than chance variation alone would be likely to produce. In this example, if the difference is too small to reject the null hypothesis, then the t-test has found no evidence that the different teaching methods affected how well the students grasped the information and took the test.
If the null hypothesis is rejected, then the t-test has suggested that the different methods may have affected students' comprehension of the material. The t-test is not meant to prove these conclusions definitively, however, since experimental error can always occur. Experimental error is any mistake made in preparing or conducting the experiment, including oversights such as incorrectly recording the data. Nonetheless, the t-test remains useful for drawing conclusions about larger populations from small samples.
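The teaching-methods comparison can be sketched as a short calculation. The example below is an illustrative sketch, not a definitive implementation: it computes the standard equal-variance (pooled) two-sample t statistic using only Python's standard library. The score lists and the function name `pooled_t_statistic` are hypothetical, and the critical value 2.306 is the two-sided 5 percent cutoff for 8 degrees of freedom taken from a standard t-table.

```python
import math
import statistics


def pooled_t_statistic(sample_1, sample_2):
    """Return (t, degrees_of_freedom) for an unpaired t-test
    that assumes both populations share the same standard deviation."""
    n1, n2 = len(sample_1), len(sample_2)
    var1 = statistics.variance(sample_1)  # sample variance (n - 1 denominator)
    var2 = statistics.variance(sample_2)
    # Combine the two variances, weighting each by its degrees of freedom.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(sample_1) - statistics.mean(sample_2)) / standard_error
    return t, n1 + n2 - 2


# Hypothetical test scores for two independent groups of students.
method_a = [78, 82, 85, 88, 92]
method_b = [70, 75, 77, 80, 83]

t_stat, df = pooled_t_statistic(method_a, method_b)

# Two-sided 5% critical value for 8 degrees of freedom (standard t-table).
CRITICAL_T = 2.306
reject_null = abs(t_stat) > CRITICAL_T

print(f"t = {t_stat:.3f} with {df} degrees of freedom")
print("Reject the null hypothesis" if reject_null else "Fail to reject the null hypothesis")
```

With this made-up data the t statistic is about 2.45, which exceeds the cutoff, so the null hypothesis of equal averages would be rejected at the 5 percent level. As the text notes, this suggests an effect but does not prove one.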
Researchers use two broad types of t-test: paired and unpaired. A paired t-test studies changes in the same subjects before and after an introduced modification. Using the same subjects this way reduces the experiment's variability and makes the resulting conclusions about this population sample more accurate. Unpaired t-tests observe two independent groups and note what happens to each of them after a change is introduced to only one group. The results of this small-sample test suggest how the introduced change might affect others in the larger population.
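A paired t-test can be sketched by working with the difference between each subject's two measurements. The example below is a minimal illustration using only Python's standard library. The before-and-after scores and the function name `paired_t_statistic` are hypothetical, and the critical value 2.776 is the two-sided 5 percent cutoff for 4 degrees of freedom taken from a standard t-table.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for a paired t-test on matched measurements."""
    if len(before) != len(after):
        raise ValueError("Paired samples must have the same length")
    # Each subject contributes one difference: after minus before.
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return mean_diff / standard_error, n - 1


# Hypothetical scores for the same five students before and after a change.
before = [60, 65, 70, 72, 80]
after = [66, 70, 73, 79, 85]

t_stat, df = paired_t_statistic(before, after)

# Two-sided 5% critical value for 4 degrees of freedom (standard t-table).
CRITICAL_T = 2.776
reject_null = abs(t_stat) > CRITICAL_T

print(f"t = {t_stat:.3f} with {df} degrees of freedom")
print("Reject the null hypothesis" if reject_null else "Fail to reject the null hypothesis")
```

Because each subject serves as their own comparison, the variation between subjects drops out of the differences. This is why the paired design reduces variability relative to comparing two separate groups.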
Bibliography
Cassens, Brett J. Preventive Medicine and Public Health. 2nd ed., Lippincott Williams & Wilkins, 1992, pp. 59–60.
Chen, James. "Normal Distribution: What It Is, Uses, and Formula." Investopedia, 13 Mar. 2024, www.investopedia.com/terms/n/normaldistribution.asp. Accessed 13 Sept. 2024.
Cullen, Bruce F., et al. Barash, Cullen, and Stoelting's Clinical Anesthesia. 9th ed., Wolters Kluwer, 2024.
Enu, Ikay. "Student's T Test: Applications." Anesthesiology Keywords Review. Edited by Raj K. Modak, Lippincott Williams & Wilkins, 2008, pp. 448–49.
Griffiths, Hugh. "The History Column: William Sealy Gosset—'Student.'" IEEE Aerospace and Electronic Systems Society, 2023, ieee-aess.org/post/blog/history-column-william-sealy-gosset-student. Accessed 13 Sept. 2024.
Hargrave, Marshall. "Standard Deviation Formula and Uses vs. Variance." Investopedia, 5 Aug. 2024, www.investopedia.com/terms/s/standarddeviation.asp. Accessed 13 Sept. 2024.
Holcomb, Anne. "Because of Beer: 1900s Guinness Employee's Quality-Control Tests Are Mainstays in the World of Statistics." MLive.com, 16 Mar. 2009, www.mlive.com/kalamabrew/index.ssf/2009/03/because_of_beer_1900s_guinness.html. Accessed 13 Sept. 2024.
"Stata for Students: T-Tests." Social Science Computing Cooperative, 2 Sept. 2016, www.ssc.wisc.edu/sscc/pubs/sfs/sfs-ttest.htm. Accessed 13 Sept. 2024.