Small Area Estimates Can Deliver Big Advances in Policy Analysis

Sep 16, 2016
John Czajka

One of my first projects for Mathematica in the early 1980s was to estimate food stamp participation rates by race and ethnicity for every U.S. county. Participation rates provide a picture of how well the program is serving those in need. I obtained county-level administrative counts of participants by race and ethnicity—the numerator in a participation rate. The challenge was to estimate the denominator—that is, the number of people who were eligible to receive food stamps.

Direct census or survey estimates were not the answer. Useful data from the 1980 census were not yet available, and the 1976 Survey of Income and Education had too small a sample to support precise estimates for areas other than states and large metropolitan regions. To find a solution, I had to look elsewhere. Applying a methodology presented in the doctoral dissertation of Noel Purcell, a graduate student at the University of Michigan, I adjusted outdated county-level estimates from the 1970 census to match more current state-level data derived from multiple surveys and a microsimulation model.
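To make the idea of adjusting older local figures to newer state totals concrete, here is a minimal sketch in Python. It is a deliberately simplified, one-dimensional ratio adjustment and not the methodology from the dissertation, which is not detailed here. The function name `ratio_adjust`, the county labels, and all of the numbers are hypothetical.

```python
def ratio_adjust(old_county_counts, current_state_total):
    """Scale outdated county counts so they sum to a newer state total.

    Assumption: each county's share of the state total has stayed roughly
    the same since the old counts were produced. This is a simplification
    for illustration only.
    """
    old_state_total = sum(old_county_counts.values())
    if old_state_total <= 0:
        raise ValueError("old county counts must sum to a positive number")
    # One adjustment factor is applied to every county in the state.
    factor = current_state_total / old_state_total
    return {county: count * factor for county, count in old_county_counts.items()}


# Hypothetical counts of eligible people from an older census.
old_counts = {"County A": 1000.0, "County B": 3000.0}

# Hypothetical, more current state-level estimate of eligible people.
adjusted = ratio_adjust(old_counts, current_state_total=5000.0)

for county, count in adjusted.items():
    print(county, count)
```

The adjusted county figures keep their old shares of the state total but now add up to the current state estimate. An adjusted count could then serve as the denominator of a county participation rate.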



This was my first foray into an area of statistical research known as small area estimation, a key tool for developing more accurate information for policymaking worldwide. Small area estimation has progressed greatly in the decades since I first used it, aided by increasingly powerful computers and methodologies designed to take advantage of the enhanced computing capacity. And it remains an exciting area of development within the statistical research community, as practitioners continue to explore ways to improve its effectiveness. 

Small area estimation addresses a problem that policy researchers often face: we need estimates for geographic areas or other domains that are smaller than our survey data can support. The survey estimates for those domains may be too imprecise for our purposes, or we may not be able to identify many of the domains of interest in the survey data. Administrative data may supply plenty of precision, but they do not always address the research question we are trying to answer.

Encompassing a variety of methodologies, small area estimation typically involves using a model to combine data from multiple sources—such as administrative records, a prior census, or a much larger survey. The goal is to improve upon the precision of the direct estimates by “borrowing strength” from these other sources.
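One common textbook way to picture "borrowing strength" is a composite estimate: a weighted average of a noisy direct survey estimate and a model-based prediction built from other sources. The Python sketch below assumes this generic form. It does not represent any particular agency's model, and the function name, parameter names, and numbers are all hypothetical.

```python
def composite_estimate(direct, direct_var, model_pred, model_var):
    """Blend a direct survey estimate with a model-based prediction.

    Assumptions (illustrative only):
      direct_var is the sampling variance of the direct estimate.
      model_var  is the variance of the model error across areas.
    The weight on the direct estimate shrinks as the direct estimate
    becomes noisier relative to the model.
    """
    if direct_var < 0 or model_var < 0 or (direct_var + model_var) == 0:
        raise ValueError("variances must be non-negative and not both zero")
    weight = model_var / (model_var + direct_var)
    return weight * direct + (1.0 - weight) * model_pred


# Hypothetical small county: a noisy direct poverty rate of 20 percent
# and a model prediction of 12 percent from administrative records.
estimate = composite_estimate(
    direct=0.20, direct_var=0.004, model_pred=0.12, model_var=0.001
)
print(round(estimate, 3))
```

With these made-up inputs, the direct estimate gets a weight of 0.2, so the result sits much closer to the model prediction. An area with a large sample, and therefore a small sampling variance, would instead stay close to its direct estimate.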

I recently attended a global conference on small area estimation, hosted by Maastricht University in the Netherlands, where I presented a paper summarizing the small area estimates produced by the U.S. government. Examples include state-, county-, and school district-level median income and poverty (from the Census Bureau); state estimates of substance use and mental disorders (from the Substance Abuse and Mental Health Services Administration); and county estimates of diabetes prevalence, incidence, and risk factors (from the Centers for Disease Control and Prevention). Mathematica produces annual state-level estimates of participation rates for the Supplemental Nutrition Assistance Program for the Food and Nutrition Service.  

At the conference, researchers from around the world explored challenges such as missing data, measurement error, and model misspecification; discussed advances in methodology, particularly Bayesian approaches; and illustrated a wide range of applications, focusing on estimates of poverty and health. 

Judging from the number of participants at the Maastricht conference and the breadth of their work, small area estimation remains a vital tool for statistical research. It promises to improve the quality of analytical data below the national level and to support decision making in many areas that can enhance the well-being of people around the globe.

