Histograms
A histogram is a graphical representation that illustrates the distribution of data by grouping numbers into ranges, known as bins. It consists of rectangles where the width indicates the range of data, and the height reflects the frequency of occurrences within that range. For example, a histogram can effectively display the heights of trees in an orchard, with the x-axis representing these height ranges and the y-axis showing the quantity of trees in each range. This type of graph is particularly beneficial for continuous data, such as height, weight, and time.
Histograms evolved from bar charts, which were used as early as the late 18th century by economist William Playfair. They serve to simplify complex data sets, allowing researchers to visualize large quantities of information in a more comprehensible format. In various fields, including computer science and information technology, histograms are employed to analyze data, such as pixel values in images, which aids in tasks like image identification and compression. Overall, histograms are versatile tools used across disciplines to summarize and analyze large amounts of data effectively.
Histograms
A histogram is a type of graph that represents the distribution of data by grouping numbers into ranges. It consists of a series of adjacent rectangles whose widths represent the size of each range and whose heights represent the number of times a value occurs within that range. For example, a histogram would be useful to a person who wanted to summarize the heights of every tree in an orchard. The x-axis of the histogram would represent the ranges of tree heights, while the y-axis would represent the number of trees in each range. Histograms are useful for displaying continuous data such as height, weight, and time. Frequency histograms use vertical bars to show frequency, meaning how many times something occurs. Histograms are used for many important purposes, such as summarizing US Census Bureau data. Researchers often use histograms to present their results in a form that is easier to comprehend.
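The orchard example can be sketched in a few lines of Python. This is only an illustrative sketch: the tree heights are made-up data, the function name `frequency_histogram` and its parameters are hypothetical, and the bars are printed sideways as text rather than drawn vertically.

```python
def frequency_histogram(values, start, width, n_bins):
    """Count how many values fall into each of n_bins equal-width bins.

    Bin i covers [start + i*width, start + (i+1)*width); the last bin also
    includes its upper edge. Values outside the overall range are ignored.
    """
    counts = [0] * n_bins
    stop = start + width * n_bins
    for v in values:
        if v < start or v > stop:
            continue
        # Clamp so that a value equal to the top edge lands in the last bin.
        i = min(int((v - start) // width), n_bins - 1)
        counts[i] += 1
    return counts


# Hypothetical tree heights in metres (made-up illustration data).
tree_heights = [2.1, 2.8, 3.0, 3.4, 3.5, 3.9, 4.2, 4.4, 4.6, 5.0, 5.7, 6.0]

counts = frequency_histogram(tree_heights, start=2.0, width=1.0, n_bins=4)

# Print a text histogram: one row per height range, one '#' per tree.
for i, count in enumerate(counts):
    low = 2.0 + i * 1.0
    print(f"{low:.0f}-{low + 1.0:.0f} m | {'#' * count} ({count})")
```

Each row corresponds to one rectangle of the histogram: the label is the height range that would appear on the x-axis, and the number of `#` marks is the frequency that would appear on the y-axis.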
![A plot showing a regular and a cumulative histogram of the same data. The data shown is 10,000 points randomly sampled from a normal distribution with mean of 0 and standard deviation of 1. By Kierano (Own work) [CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons 98402113-29042.jpg](https://imageserver.ebscohost.com/img/embimages/ers/sp/embedded/98402113-29042.jpg?ephost1=dGJyMNHX8kSepq84xNvgOLCmsE2epq5Srqa4SK6WxWXS)
![histogram of travel time (US Census 2000 data), new version made in Stata Qwfp at English Wikipedia [CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons 98402113-29043.jpg](https://imageserver.ebscohost.com/img/embimages/ers/sp/embedded/98402113-29043.jpg?ephost1=dGJyMNHX8kSepq84xNvgOLCmsE2epq5Srqa4SK6WxWXS)
Overview
The word histogram comes from the Greek words "isto-s," which means "mast" (a long vertical shape), and "gram-ma," which means "something written." The term histogram was coined by English mathematician Karl Pearson. Historians believe, however, that histograms were in use long before a term was invented for them. The precursors to histograms were bar charts, which were first used by economist William Playfair in his 1786 publication The Commercial and Political Atlas. Many other academics adopted bar charts after this, including Florence Nightingale, who used them in 1859 to convince the government to improve army hygiene. Researchers use bar charts to examine categorical data, in which each bar stands for a separate category. Histograms evolved from bar charts to represent continuous ranges of values rather than separate categories.
Data sets can be very complex. A large data set usually contains many different values, and plotting each value individually can produce confusing results. Instead of composing a graph that shows every individual data point, researchers use histograms to group their results into convenient ranges, called bins. Bins can span ranges of tens, hundreds, thousands, and so on. The appropriate bin width depends on the amount and spread of the data a researcher is dealing with. The bin ranges are listed on the x-axis of a histogram. The y-axis represents the frequency observed in each bin.
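The effect of the bin width can be illustrated with a small Python sketch. The data and the helper name `bin_counts` are hypothetical, chosen only to show the same values grouped into ranges of tens and then of hundreds. For brevity, bins that contain no values are simply left out of the result.

```python
from collections import Counter


def bin_counts(values, bin_width):
    """Group values into bins of the given width and count each bin.

    Returns a dict mapping each bin's lower edge to its frequency,
    sorted by lower edge. A bin covers [low, low + bin_width).
    """
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Made-up integer measurements, for illustration only.
data = [3, 8, 12, 15, 27, 41, 58, 64, 99, 105, 230, 480, 950]

print(bin_counts(data, 10))    # narrow bins: many ranges, most nearly empty
print(bin_counts(data, 100))   # wide bins: a handful of ranges
```

With a width of 10, the thirteen values spread over eleven bins and the summary is barely simpler than the raw data. With a width of 100, they collapse into five bins, which shows at a glance that most values lie below 100.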
Many fields of study use histograms to summarize data. Histograms are especially popular in computer science and information technology, where they are commonly used to process digital images. An image histogram counts how many pixels, the tiny elements that images are composed of, take each value. This information makes it possible to identify similar images or to compress images. Histograms are also useful in computer databases, as the data contained in these systems can be too large to comprehend unless summarized by a histogram. The histogram is not limited to computer technology fields, however, and is used to summarize all sorts of large data sets.
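As a rough sketch of the image case, the following Python code builds a histogram of pixel values and compares two histograms. Several things here are assumptions for illustration and not taken from the text above: the images are tiny made-up lists of 8-bit grey levels (0 to 255), the function names are hypothetical, and histogram intersection is just one common similarity measure, swapped in as an example.

```python
def pixel_histogram(pixels, levels=256):
    """Count how many pixels take each intensity value 0..levels-1."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    return counts


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: overlap of two histograms after normalising each."""
    n1, n2 = sum(h1), sum(h2)
    return sum(min(a / n1, b / n2) for a, b in zip(h1, h2))


# Tiny made-up "images", flattened to lists of grey levels (assumed 8-bit).
image_a = [0, 0, 10, 10, 200, 255, 255, 255]
image_b = [255, 0, 255, 10, 0, 255, 200, 10]        # same pixels as image_a, rearranged
image_c = [100, 100, 100, 120, 120, 130, 130, 130]  # entirely different grey levels

hist_a = pixel_histogram(image_a)
hist_b = pixel_histogram(image_b)
hist_c = pixel_histogram(image_c)

print(histogram_intersection(hist_a, hist_b))  # identical histograms
print(histogram_intersection(hist_a, hist_c))  # no grey levels in common
```

Because a histogram records only how often each value occurs and not where it occurs, `image_a` and `image_b` produce identical histograms even though their pixels are arranged differently. This makes the histogram a compact summary for finding similar images, but not a complete description of an image.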