Histograms

A histogram is a type of graph that represents the distribution of data by grouping numbers into ranges. It consists of a series of rectangles whose widths represent the size of each range and whose heights represent the number of times a value occurs within that range. For example, a histogram would be useful to a person who wanted to group the heights of every tree in an orchard. The x-axis of the histogram would represent the ranges of tree heights, while the y-axis would represent the number of trees in each range. Histograms are useful for displaying continuous data such as height, weight, and time. Frequency histograms use vertical bars to show frequency, meaning how many times something occurs. Histograms are used for many important functions, such as summarizing US Census Bureau data. Researchers often use histograms to summarize their research results for easier comprehension.
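The orchard example can be sketched in a few lines of Python. This is a minimal illustration rather than a standard method: the tree heights are invented, the one-metre range size is an assumption, and the names `count_by_range` and `render` are hypothetical. The bars are drawn sideways as rows of text, whereas a printed histogram would normally draw them vertically.

```python
from collections import Counter

# Hypothetical tree heights in metres (illustrative data, not real measurements).
tree_heights = [2.1, 2.4, 3.0, 3.2, 3.3, 3.8, 4.1, 4.4, 4.5, 5.2]

BIN_WIDTH = 1.0  # each range covers one metre (an assumption for this sketch)


def count_by_range(values, width):
    """Map the lower edge of each range to the number of values inside it."""
    counts = Counter()
    for v in values:
        lower_edge = (v // width) * width  # e.g. 3.8 falls in the range starting at 3.0
        counts[lower_edge] += 1
    return dict(sorted(counts.items()))


def render(counts, width):
    """Return one text row per range; the bar length equals the frequency."""
    return [f"{lo:4.1f}-{lo + width:4.1f} | {'#' * n}" for lo, n in counts.items()]


counts = count_by_range(tree_heights, BIN_WIDTH)
print("\n".join(render(counts, BIN_WIDTH)))
```

With this sample data the longest bar belongs to the range starting at 3.0 metres, which holds four trees. Ranges that contain no trees are simply omitted by this sketch.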

Overview

The word histogram comes from the Greek words "isto-s," which means "mast" (a long vertical shape), and "gram-ma," which means "something written." The term histogram was coined by English mathematician Karl Pearson. Historians believe, however, that histograms were in use long before a term was invented for them. The precursors to histograms were bar charts, which were first used by economist William Playfair in his 1786 publication The Commercial and Political Atlas. Many other academics later adopted bar charts, including Florence Nightingale, who used them to convince the government to improve army hygiene in 1859. Researchers use bar charts to examine categorical, or individual, data. Histograms evolved from bar charts to represent continuous ranges of data rather than separate categories.

Data sets can be very complex. A large group of data usually has many different characteristics, and plotting every value individually can produce very confusing results. Instead of composing a graph that shows each individual value in a data set, researchers use histograms to group their results into convenient ranges, called bins. Bins can span ranges of tens, hundreds, thousands, and so on. The bin width depends on the amount and spread of the data a researcher is dealing with. The bin ranges are listed on the x-axis of a histogram. The y-axis represents the frequency observed in each bin.
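The effect of bin width can be illustrated with a short Python sketch. The function name `bin_counts`, its parameters, and the sample measurements are all hypothetical and chosen only for illustration. The sketch assumes equal-width bins that include their lower edge and exclude their upper edge.

```python
def bin_counts(values, start, width, n_bins):
    """Count how many values fall in each of n_bins equal-width bins.

    Bin i covers the half-open range [start + i*width, start + (i+1)*width).
    Values that fall outside every bin are ignored.
    """
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical measurements (illustrative data only).
measurements = [3, 8, 12, 15, 17, 21, 24, 26, 29, 33, 38, 47]

# Five bins of width 10 covering 0-49.
narrow = bin_counts(measurements, start=0, width=10, n_bins=5)

# Two bins of width 25 covering the same 0-49 span.
wide = bin_counts(measurements, start=0, width=25, n_bins=2)

print(narrow)  # [2, 3, 4, 2, 1]
print(wide)    # [7, 5]
```

The same twelve values look quite different under the two choices: the narrower bins show a peak between 20 and 29, while the wider bins hide it. This is why the bin width is usually matched to the amount and spread of the data.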

Many fields of study use histograms to track data. Histograms are especially popular in the computer science and information technology fields of academia, which commonly use them to process computer images. These histograms record the values of pixels, the tiny elements that images are composed of. This information makes it possible to identify similar images or to compress images. Histograms are also useful in computer databases, as the data contained in these systems can be very large and difficult to interpret unless summarized by a histogram. The histogram is not limited to computer technology fields, however, and is used for summarizing all sorts of large data quantities.
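As a rough illustration of how pixel histograms can help identify similar images, the Python sketch below compares small grayscale "images" by the overlap of their histograms, a measure often called histogram intersection. The text above does not name a particular comparison method, so this choice is an assumption. The pixel values, the four-bin layout, and the function names `intensity_histogram` and `similarity` are likewise hypothetical, and the sketch assumes both images contain the same number of pixels.

```python
def intensity_histogram(pixels, n_bins=4, max_value=255):
    """Count grayscale pixel values (0..max_value) in n_bins equal-width bins."""
    bin_width = (max_value + 1) / n_bins
    counts = [0] * n_bins
    for p in pixels:
        counts[int(p // bin_width)] += 1
    return counts


def similarity(hist_a, hist_b):
    """Histogram intersection scaled to 0..1 (1.0 means identical histograms).

    Assumes both histograms were built from the same number of pixels.
    """
    overlap = sum(min(a, b) for a, b in zip(hist_a, hist_b))
    return overlap / sum(hist_a)


# Hypothetical 8-pixel grayscale images (0 = black, 255 = white).
image_a = [10, 20, 30, 200, 210, 220, 100, 110]
image_b = [15, 25, 35, 205, 215, 225, 105, 180]   # close to image_a
image_c = [250, 240, 245, 230, 235, 255, 225, 228]  # almost entirely bright

hist_a = intensity_histogram(image_a)
hist_b = intensity_histogram(image_b)
hist_c = intensity_histogram(image_c)

print(similarity(hist_a, hist_b))  # 0.875
print(similarity(hist_a, hist_c))  # 0.375
```

Because only the bin counts are compared, two images with the same mix of light and dark pixels score as similar even if the pixels are arranged differently. That is a known limitation of comparing images by histogram alone.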
