Imputing Attendance Data in a Longitudinal Multilevel Panel Data Set
U.S. Department of Health and Human Services, Administration for Children and Families, Office of Planning, Research, and Evaluation
- When the desired estimates are simple univariate descriptive statistics, single imputation techniques such as mean replacement can perform as well as more complicated techniques such as multiple imputation.
Collecting attendance data places intensive demands on program staff, so the data are often difficult to collect and can contain a fair amount of missing values, which can compromise the reliability and validity of attendance estimates. Little is known about which methods for handling missing data generate the most accurate estimates of attendance. To address this issue, we simulated data on children’s weekly child care center attendance over the course of a year and compared different methods of estimating attendance. The results indicate that when data are missing on one variable and at one level only, complete case analysis produces accurate estimates of average weekly attendance, regardless of the amount or type of missingness. When estimating total yearly attendance, complete case analysis is inaccurate, but both mean replacement and multiple imputation produce reasonable estimates. A lesson learned from this exercise is that when the desired estimates are simple univariate descriptive statistics, single imputation techniques such as mean replacement can perform as well as more complicated techniques such as multiple imputation.
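The comparison described above can be illustrated with a small simulation. The sketch below is not the authors' simulation design. It assumes a hypothetical cohort of 200 children, a 5-day week, binomially distributed weekly attendance with a child-specific attendance rate, and 30 percent of weekly records missing completely at random. Mean replacement is taken here to mean filling each missing week with that child's own observed weekly mean, which is one of several possible definitions. Multiple imputation is omitted to keep the example short.

```python
import numpy as np


def simulate_attendance(n_children=200, n_weeks=52, seed=0):
    """Simulate days attended per week (0-5) for each child (hypothetical design)."""
    rng = np.random.default_rng(seed)
    # Assumption: each child has a stable attendance rate between 50% and 95%.
    rate = rng.uniform(0.5, 0.95, size=n_children)
    days = rng.binomial(5, rate[:, None], size=(n_children, n_weeks))
    return days.astype(float)


def add_missing(data, frac=0.3, seed=1):
    """Blank out a fraction of weekly records completely at random (MCAR)."""
    rng = np.random.default_rng(seed)
    observed = data.copy()
    observed[rng.random(data.shape) < frac] = np.nan
    return observed


def mean_replace(observed):
    """Fill each missing week with that child's own observed weekly mean."""
    child_mean = np.nanmean(observed, axis=1, keepdims=True)
    return np.where(np.isnan(observed), child_mean, observed)


def compare_methods(frac=0.3):
    """Return true and estimated weekly averages and yearly totals."""
    full = simulate_attendance()
    observed = add_missing(full, frac=frac)
    filled = mean_replace(observed)
    return {
        # Average days attended per week.
        "weekly_true": full.mean(),
        "weekly_observed_only": np.nanmean(observed),
        # Average total days attended per child over the year.
        "yearly_true": full.sum(axis=1).mean(),
        "yearly_observed_only": np.nansum(observed, axis=1).mean(),
        "yearly_mean_replaced": filled.sum(axis=1).mean(),
    }


results = compare_methods()
for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions, the weekly average computed from observed records alone stays close to the true value, while the yearly total computed from observed records alone falls short because the missing weeks contribute nothing to the sum. Filling the missing weeks before summing removes most of that shortfall.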