What Works for Whom? A Bayesian Approach to Channeling Big Data Streams for Policy Analysis
In the coming years, public programs will continuously capture more and richer data than they do now, for example from web-based tools used by participants in employment services, tablet-based educational curricula, and electronic health records for Medicaid beneficiaries. Policy evaluations that seek to take full advantage of the volume and velocity of these data streams will require novel statistical methods. In this paper, we present one such method: a Bayesian approach to randomized policy evaluations that efficiently estimates heterogeneous treatment effects, identifying what works for whom. The approach enables evaluators to consider multiple candidate interventions simultaneously, matching each study subject with the intervention most likely to benefit that subject. The trial design adapts to accumulating evidence: over the course of a trial, more study subjects are allocated to the treatment arms that appear more promising for the specific subgroup to which each subject belongs. Using a randomized experiment with students in an online course as a motivating example, we conduct a simulation study to identify the conditions under which our Bayesian adaptive design can produce better inference and ultimately smaller trials. In particular, we describe conditions under which there is more than a 90 percent chance that inference from the Bayesian adaptive design is superior to inference from a standard design, while using less than one-third of the sample size. Under the right circumstances, then, the Bayesian adaptive approach we propose can channel streams of big data to learn efficiently what works for whom.
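To make the idea of subgroup-specific adaptive allocation concrete, the following is a minimal sketch and not the model used in the paper. It assumes a binary outcome, independent Beta(1, 1) priors for each subgroup-by-arm success rate, and an allocation rule that assigns each new subject to an arm with probability equal to the estimated posterior probability that the arm is best for that subject's subgroup. The function names (`allocation_probs`, `run_trial`) and the success rates in `true_rates` are hypothetical and chosen only for illustration.

```python
import random


def allocation_probs(successes, failures, n_draws=2000, rng=None):
    """Estimate, for one subgroup, the posterior probability that each arm is best.

    successes[a] and failures[a] are the outcome counts observed so far for
    arm a within this subgroup. Assumption: binary outcomes with independent
    Beta(1, 1) priors on each arm's success rate.
    """
    rng = rng or random.Random()
    n_arms = len(successes)
    wins = [0] * n_arms
    for _ in range(n_draws):
        # One joint draw from the (independent) Beta posteriors of all arms.
        draws = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        wins[draws.index(max(draws))] += 1
    return [w / n_draws for w in wins]


def run_trial(true_rates, n_subjects, seed=0):
    """Simulate an adaptive trial with subgroup-specific allocation.

    true_rates[g][a] is the hypothetical success probability of arm a in
    subgroup g. Returns counts[g][a], the number of subjects from subgroup g
    who were assigned to arm a.
    """
    rng = random.Random(seed)
    n_groups = len(true_rates)
    n_arms = len(true_rates[0])
    succ = [[0] * n_arms for _ in range(n_groups)]
    fail = [[0] * n_arms for _ in range(n_groups)]
    counts = [[0] * n_arms for _ in range(n_groups)]
    for _ in range(n_subjects):
        g = rng.randrange(n_groups)  # subject arrives from a random subgroup
        probs = allocation_probs(succ[g], fail[g], n_draws=200, rng=rng)
        a = rng.choices(range(n_arms), weights=probs)[0]
        counts[g][a] += 1
        # Observe the outcome and update that subgroup-arm cell only.
        if rng.random() < true_rates[g][a]:
            succ[g][a] += 1
        else:
            fail[g][a] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical scenario: arm 1 is better for subgroup 0, arm 0 for subgroup 1.
    print(run_trial([[0.2, 0.8], [0.8, 0.2]], n_subjects=600, seed=1))
```

In this sketch, allocation within each subgroup drifts toward the arm that the accumulating data favor for that subgroup, which is the behavior the abstract describes. A full design would also need a model that shares information across subgroups and explicit stopping rules, neither of which is attempted here.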