Estimating unemployment

Summary: Unemployment rates are calculated using intricate statistical models and sampling methods.

An unemployed person is generally defined as an individual who is available for work but who currently does not have a job. Overall unemployment is typically quantified using the unemployment rate, which represents the number of unemployed people as a percentage of the labor force. The Bureau of Labor Statistics is an independent statistical agency of the U.S. federal government primarily responsible for measuring labor market activity. Many mathematicians and statisticians are involved in data collection, modeling, and estimation of employment activity, including at the highest levels of direction and management. For example, Janet Norwood was the first woman commissioner of the U.S. Bureau of Labor Statistics and frequently spoke to the Joint Economic Committee and other congressional committees. She was also president of the American Statistical Association and chair of the Advisory Council on Unemployment Compensation. Regarding her work, she noted, “These data figure very prominently in most of the political debates, so it is extremely important that they be accurate and of high quality, and that they be released in a manner that is totally objective.”
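The rate definition given above reduces to a one-line calculation. The following is a minimal sketch, assuming the labor force is counted as employed plus unemployed persons; the function name and the counts are hypothetical and chosen only for illustration.

```python
def unemployment_rate(unemployed: float, employed: float) -> float:
    """Return the number of unemployed as a percentage of the labor force."""
    # Assumption: the labor force is the sum of employed and unemployed persons.
    labor_force = unemployed + employed
    if labor_force <= 0:
        raise ValueError("labor force must be positive")
    return 100.0 * unemployed / labor_force


# Hypothetical counts, in thousands of persons (not real survey figures).
rate = unemployment_rate(unemployed=8_000, employed=152_000)
print(f"{rate:.1f}%")  # prints 5.0%
```

People who are neither working nor available for work fall outside the labor force, so they affect neither the numerator nor the denominator.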

Economist John Maynard Keynes’s revolutionary work, The General Theory of Employment, Interest and Money, was published in 1936. The Industrial Revolution and the shift away from an agrarian economy had significantly changed the way in which researchers in many fields looked at economic measures, including employment, and the Great Depression brought even greater attention and emphasis to these concepts. Because of labor-market volatility in the late 1920s, the 1930 U.S. census attempted the first comprehensive federal measure of unemployment, but data from the decennial census were not timely enough to be useful in assessing the effectiveness of Depression legislation to aid unemployed workers. Statisticians used newly emerging polling methods to develop better measures and mathematical models. Better methods also changed, at times, the definition of unemployment. Overall, it is commonly accepted that unemployment has negative effects on the financial and economic status of societies and individuals. As workers become unemployed, the goods and services that they could have produced are lost along with the purchasing power of these workers, which in turn leads to the unemployment of more workers. In addition, a high unemployment rate can induce significant social changes and has been a source of civil unrest and revolutions. Mathematicians and statisticians continue to create explanatory and forecasting models that are used to guide policies and decisions intended to stabilize economies and aid unemployed workers at local, state, and national levels. These models draw from mathematical ideas and techniques in a wide range of areas, including time series analysis, equilibrium modeling, structural component modeling, neural networks, and simulation.

Sample Design and Collection of Unemployment Data

In most countries, the task of collecting and analyzing unemployment-related information is assigned to certain governmental agencies. In the United States, the Current Population Survey (CPS), conducted by the Census Bureau for the Bureau of Labor Statistics since the mid-twentieth century, provides most of the necessary data. Counting every unemployed person each month is impractical in terms of both cost and time, so the Census Bureau conducts a monthly survey of the population using a sample of households that is designed to represent the civilian population of the United States. At the start of the twenty-first century, the CPS surveyed about 50,000 households per month. The sample is generally a multistage stratified sample drawn from many different sample areas. It provides estimates for the nation and serves as part of model-based estimates for individual states and other geographic areas.

In the first stage of sampling, the United States is divided into primary sampling units (PSUs) that usually consist of a metropolitan area, a large county, or a group of smaller counties. PSUs are then grouped into strata based on some factor that divides the population into mutually exclusive homogeneous groups. The homogeneity of each stratum ensures that the within-stratum variability is very small compared to the variability between strata. One PSU is then randomly selected from each stratum with a probability of selection proportional to the PSU’s population size. The second stage of sampling consists of randomly selecting small groups of housing units from the sample PSUs. Elements from this sample of housing units are called “secondary sampling units” (SSUs). These households are usually selected from the lists of addresses obtained from the last decennial census of the population. Housing units from blocks with similar demographic composition and geographic proximity are grouped together in the list. The final sample is usually described as a two-stage sample, but occasionally a third stage of sampling, called “field subsampling,” is necessary when the actual SSU size is extremely large, in order to keep the surveyor’s workload manageable. This involves selecting a systematic subsample of the SSU to reduce the number of sample housing units to a more convenient number. Once the survey is designed and the sample is drawn, field representatives and computer-assisted telephone interviewers contact a responsible person living in each selected sample unit to complete the interview.
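The three sampling stages described above can be sketched in a few lines of Python. This is an illustrative toy only, not the Census Bureau's procedure: the strata, PSU names, populations, address lists, cluster size, and all function names are hypothetical, and the real design involves many more strata and additional weighting steps.

```python
import random


def select_psu(psus, rng):
    """Stage 1: pick one PSU from a stratum, probability proportional to size."""
    names = [name for name, _ in psus]
    sizes = [population for _, population in psus]
    return rng.choices(names, weights=sizes, k=1)[0]


def select_clusters(addresses, cluster_size, n_clusters, rng):
    """Stage 2: group adjacent addresses into clusters and draw some at random."""
    clusters = [addresses[i:i + cluster_size]
                for i in range(0, len(addresses), cluster_size)]
    return rng.sample(clusters, n_clusters)


def systematic_subsample(units, n, rng):
    """Field subsampling: random start, then every step-th unit."""
    if not 0 < n <= len(units):
        raise ValueError("n must be between 1 and len(units)")
    step = len(units) // n
    start = rng.randrange(step)
    return [units[start + i * step] for i in range(n)]


# Hypothetical strata of (PSU name, population) pairs; all values are invented.
strata = {
    "stratum_A": [("Metro 1", 900_000), ("Metro 2", 600_000)],
    "stratum_B": [("County group 1", 120_000),
                  ("County group 2", 80_000),
                  ("County group 3", 50_000)],
}

rng = random.Random(2024)  # fixed seed so the sketch is reproducible
selected = {stratum: select_psu(psus, rng) for stratum, psus in strata.items()}

# Hypothetical address list for one selected PSU, already sorted so that
# neighboring entries are geographically and demographically similar.
addresses = [f"unit-{i:03d}" for i in range(40)]
ssus = select_clusters(addresses, cluster_size=4, n_clusters=3, rng=rng)

# A hypothetical oversized SSU reduced to a manageable interviewer workload.
oversized_ssu = [f"apt-{i:03d}" for i in range(120)]
workload = systematic_subsample(oversized_ssu, n=8, rng=rng)

print(selected)
print(ssus)
print(workload)
```

Selecting with probability proportional to size means that a PSU holding a larger share of its stratum's population is more likely to represent that stratum, while the systematic subsample spreads the retained units evenly across the oversized list.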

Seasonal Adjustment of Unemployment Data

The data collected by the CPS are subjected to a series of transformations and adjustments before analytical tools are applied to fit adequate models to the unemployment rate and explain its behavior in terms of relevant factors. Because some types of employment are seasonal or cyclical over time, such as December holiday retail sales or fall farm harvesting, adjustments must often be made to account for such cycles. In fact, throughout a one-year period, the level of unemployment varies continuously because of such seasonal events as changes in weather, major holidays, agricultural harvesting, and school openings and closings. Since seasonal events follow an almost regular periodic pattern each year, their influence on the overall pattern can be estimated and removed. There are two popular methods for removing seasonality. The first estimates the seasonal component using a regression model with time series errors. The explanatory variables in the regression equation are 12-period harmonic terms. Once the regression coefficients are estimated, the fitted values are evaluated for each month and subtracted from the corresponding actual values, yielding the seasonally adjusted series. The second method consists of simply taking seasonal differences of the unemployment series. The removal of the anticipated seasonal component makes it easier for data analysts to observe fundamental variations in the unemployment level, such as trends, gains, nonseasonal intrinsic cycles, and effects of external events, especially those related to economic factors.
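Both seasonal adjustment methods described above can be illustrated on a synthetic monthly series. The sketch below is a simplification under stated assumptions: it fits the harmonic regression by ordinary least squares, which ignores the time series structure of the errors mentioned in the text, and it adds an intercept and a linear trend to the regression so that the harmonics capture only the seasonal pattern. The data, function names, and parameter values are hypothetical.

```python
import numpy as np


def harmonic_design(n_obs, period=12):
    """Regressors: intercept, linear trend, and sine/cosine seasonal harmonics."""
    t = np.arange(n_obs)
    cols = [np.ones(n_obs), t]  # intercept and trend are assumptions of this sketch
    for k in range(1, period // 2 + 1):
        cols.append(np.cos(2 * np.pi * k * t / period))
        if k < period / 2:  # the sine term is identically zero at k = period / 2
            cols.append(np.sin(2 * np.pi * k * t / period))
    return np.column_stack(cols)


def seasonally_adjust(y, period=12):
    """Method 1: subtract the fitted harmonic component from the series."""
    y = np.asarray(y, dtype=float)
    X = harmonic_design(len(y), period)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # ordinary least squares
    seasonal = X[:, 2:] @ beta[2:]  # harmonic columns only
    return y - seasonal


def seasonal_difference(y, period=12):
    """Method 2: difference each value against the same month one year earlier."""
    y = np.asarray(y, dtype=float)
    return y[period:] - y[:-period]


# Synthetic six-year monthly series: level + trend + annual cycle + noise.
rng = np.random.default_rng(0)
months = np.arange(72)
rate = (5.0 + 0.01 * months
        + 0.6 * np.cos(2 * np.pi * months / 12)
        + rng.normal(0.0, 0.05, size=72))

adjusted = seasonally_adjust(rate)
differenced = seasonal_difference(rate)
print(adjusted[:12].round(2))
print(differenced[:12].round(2))
```

The regression approach keeps the full length of the series and leaves the trend in place, whereas seasonal differencing loses the first twelve observations and turns a linear trend into a constant.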

Rate Estimation and Prediction

Since the unemployment survey is conducted in the same manner on a monthly basis, the type of data collected is called “time series data.” Dependence or autocorrelation among the observations in such data is common, which means that most classical statistical models that assume independent observations are not applicable for estimation and prediction with most unemployment data. Mathematical and statistical models that take into account the particular nature of time-dependent data are called “time series models.” Among the most popular and useful are autoregressive integrated moving average (ARIMA) models and their seasonal extension (SARIMA). Such models can be used to describe the relationship between a current unemployment rate and past ones using differencing operations and linear equations. As a consequence, the model can also be used to predict future realizations of the unemployment rate. The ARIMA models are very flexible in the sense that they allow for the inclusion of external factors, which can help explain the movement of the unemployment rate and lead to estimators and predictors with smaller variability.
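To make the idea of combining differencing with a linear equation concrete, the sketch below fits the simplest nontrivial member of the family, an ARIMA(1,1,0) model, in which the month-to-month change in the rate depends linearly on the previous change. It is a deliberately reduced illustration: the coefficients are estimated by ordinary least squares rather than the maximum likelihood methods used in statistical software, there are no moving average, seasonal, or external terms, and the monthly rates and function names are hypothetical.

```python
def fit_arima_110(y):
    """Fit d[t] = c + phi * d[t-1] + error by least squares, d[t] = y[t] - y[t-1]."""
    if len(y) < 4:
        raise ValueError("need at least four observations")
    d = [b - a for a, b in zip(y, y[1:])]  # first differences (the "integrated" part)
    x, z = d[:-1], d[1:]                   # lagged and current differences
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("differences are constant; phi is not identifiable")
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    phi = sxz / sxx              # autoregressive coefficient
    c = mean_z - phi * mean_x    # intercept of the difference equation
    return c, phi


def forecast(y, c, phi, steps):
    """Predict future levels by iterating the fitted equation and re-integrating."""
    level, diff = y[-1], y[-1] - y[-2]
    predictions = []
    for _ in range(steps):
        diff = c + phi * diff  # predicted next month-to-month change
        level += diff          # add the change back onto the level
        predictions.append(level)
    return predictions


# Hypothetical monthly unemployment rates in percent (not real data).
rates = [5.0, 5.1, 5.3, 5.2, 5.4, 5.7, 5.9, 5.8, 6.0, 6.3, 6.4, 6.3]

c, phi = fit_arima_110(rates)
predictions = forecast(rates, c, phi, steps=3)
print(f"c = {c:.3f}, phi = {phi:.3f}")
print([round(p, 2) for p in predictions])
```

A series this short is far too small for reliable estimation; the example only shows the mechanics of differencing, fitting, and integrating the forecasts back to the original scale.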

Bibliography

Downey, Kirstin. The Woman Behind the New Deal. New York: Anchor Books, 2010.

Fienberg, Stephen. “A Conversation With Janet L. Norwood.” Statistical Science 9, no. 4 (1994).

Pissarides, Christopher. Equilibrium Unemployment Theory. Cambridge, MA: The MIT Press, 2000.

Zbikowski, Andrew, et al. The Current Population Survey: Design and Methodology. Technical Paper 40. Washington, DC: Government Printing Office, 2006.