Every day, machine learning algorithms become a bigger part of our lives. They give us targeted ads on social media, facial recognition software on our phones, and suggestions for new TV shows to watch. They’re also becoming more common in public policy, informing decisions on everything from child welfare to pretrial bail reform to criminal sentencing.
Through increased computing power, algorithms hold the promise of helping humans tap into deep reservoirs of data to make better decisions based on more accurate, evidence-based predictions. But despite their potential, policymakers and researchers are still learning when to use algorithms and how to avoid unintended consequences, such as exacerbating racial or social inequities in policies and programs. A recent study in Pittsburgh sheds light on the limits and opportunities of using algorithms to target resources for students who the data indicate are likely to experience academic problems in the near future.
Last year, as part of its work for the Regional Educational Laboratory Mid-Atlantic, a research center funded by the U.S. Department of Education, Mathematica developed a machine learning algorithm to identify students in Pittsburgh Public Schools who are at risk of near-term academic problems. For the study, we looked specifically at absenteeism, suspensions, poor grades, course failure, and low performance on state tests. By accurately identifying students at risk of experiencing academic problems in the next quarter or semester, districts can target services to those students and intervene before problems occur or escalate into more serious consequences, such as dropping out of school.
In a follow-up study that was published this month, Mathematica and Pittsburgh Public Schools examined whether the new machine learning algorithm was more accurate than the simple early warning system that the school district uses now. Importantly, the simple early warning system in place in Pittsburgh is similar to the system that many school districts use, so the results have implications for school districts across the country:
- Current approach: a simple early warning system based on prior performance. Currently, if a student in the Pittsburgh school district is chronically absent, has a low grade point average (GPA), fails a course, or is suspended in the first semester, the district’s simple early warning system predicts that the student will experience the same academic problem in the second semester. This system considers each academic problem individually and produces only a yes/no prediction of whether the student will have that problem.
- New approach: a machine learning algorithm based on many predictors. Under the new approach, the district considers all of the academic problems together when making a prediction. For example, it considers whether the student was chronically absent in the prior quarter when predicting whether the student will earn a low GPA in the next quarter. The algorithm also considers demographic characteristics as well as data on use of social services by the student or the student’s family. The algorithm then generates, for each student, a risk score from 0 to 1 that represents the likelihood that the student will have an academic problem. For example, a risk score of .82 means there is an estimated 82 percent chance the student will experience an academic problem. (A simplified sketch of both approaches follows this list.)
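To make the contrast concrete, here is a minimal sketch in Python of the two approaches. The toy data, the column names, and the choice of logistic regression are all illustrative assumptions; the study does not specify the algorithm’s underlying model, and this sketch omits the demographic and social-service predictors described above.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical first-semester indicators and a second-semester outcome.
students = pd.DataFrame({
    "chronically_absent_s1": [1, 0, 0, 1],
    "low_gpa_s1":            [0, 1, 0, 1],
    "failed_course_s1":      [0, 0, 0, 1],
    "suspended_s1":          [0, 0, 1, 0],
    "low_gpa_s2":            [1, 1, 0, 1],  # observed outcome used as the training label
})

# Current approach: a simple rule, one problem at a time. It predicts a
# low GPA next semester iff the student had a low GPA this semester,
# yielding only a binary yes/no flag.
students["flag_low_gpa_s2"] = students["low_gpa_s1"]

# New approach (sketch): a model trained on many predictors at once,
# producing a continuous risk score between 0 and 1 for each student.
predictors = ["chronically_absent_s1", "low_gpa_s1",
              "failed_course_s1", "suspended_s1"]
model = LogisticRegression().fit(students[predictors], students["low_gpa_s2"])
students["risk_score"] = model.predict_proba(students[predictors])[:, 1]
# A risk score of 0.82 would mean an estimated 82 percent chance of the problem.
```

The key practical difference is the output: the rule produces a flag, while the model produces a score that can be compared across students.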
We found that both approaches are similarly accurate. In other words, the additional data incorporated by the algorithm did not help the district make better predictions. This finding confirms that a student’s prior performance is the most important predictor of future behavior. Many districts may understandably conclude that the relative accuracy of their current simple early warning system warrants no change in approach. Indeed, because both approaches are similarly accurate, districts might want to consider other criteria, such as resource constraints and logistical demands, when deciding which approach to adopt.
The simple early warning system requires less data and is cheaper to set up and maintain, but its binary prediction can be limiting for districts. For common academic problems, a simple early warning system will predict that a large percentage of students will have an academic problem, often creating a mismatch between the students identified and the resources available to serve them. To take a hypothetical example, say the simple early warning system identifies 30 percent of students as at risk for chronic absenteeism. That information might not be especially helpful to a district that can provide extra support to only 10 percent of students. With the simple early warning system, the district wouldn’t have a way to identify the most at-risk students within the larger group.
Because the machine learning system’s risk scores allow students to be ranked, that system is much more flexible and useful for districts. In the example above, a district could target services to the 10 percent of students most at risk of being chronically absent. And if the level of available resources changes the next semester, the school could adjust the number of students targeted for services while continuing to use the risk scores as a guide for maximizing the efficiency and effectiveness of those services. The ability to develop a data-driven strategy that is compatible with a district’s level of resources is a huge advantage of the machine learning algorithm.
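Continuing the hypothetical example, the sketch below shows how a district might turn risk scores into a capacity-constrained target list. The student IDs, the scores, and the 10 percent capacity figure are assumed for illustration.

```python
import pandas as pd

# Hypothetical risk scores for chronic absenteeism from the algorithm.
scores = pd.DataFrame({
    "student_id": range(1, 11),
    "risk_score": [0.91, 0.12, 0.45, 0.78, 0.33, 0.88, 0.05, 0.67, 0.52, 0.29],
})

# A binary flag at a fixed cutoff might identify more students than the
# district can serve; ranking by score lets capacity drive the cutoff.
capacity = 0.10  # district can serve 10 percent of students this semester
n_served = max(1, int(len(scores) * capacity))
targeted = scores.nlargest(n_served, "risk_score")
print(targeted)  # the most at-risk students, up to the district's capacity

# If resources change next semester, only the capacity figure changes;
# the same risk scores still determine who is served first.
targeted_next = scores.nlargest(int(len(scores) * 0.30), "risk_score")
```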
I mentioned at the top that policymakers and researchers are still learning how to tap into the benefits of algorithms without introducing or exacerbating inequities. Both algorithms themselves and the data they use can be biased. Researchers are still exploring different ways to assess algorithmic bias, and there are no set standards of fairness, partly because the context of each algorithm matters. Although we did not conduct a full fairness assessment in the most recent study, we did look at the algorithm’s accuracy among different racial and ethnic groups. We found that both the current simple early warning system and the algorithm were less accurate in their predictions for Black students than for White students. This does not mean the current flags by Pittsburgh Public Schools or the predictive algorithm are biased against Black students; it means only that outcomes for Black students are harder to predict from existing data. This finding suggests districts and researchers should investigate whether they are missing key data related to outcomes for Black students. Including additional data in future machine learning algorithms may improve their accuracy.
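As a rough illustration of the kind of subgroup check described above, the sketch below computes prediction accuracy separately by group. The data are hypothetical, and simple accuracy stands in for whatever metrics the study actually used, which are not detailed here.

```python
import pandas as pd

# Hypothetical predictions and observed outcomes with a group label.
results = pd.DataFrame({
    "group":     ["Black", "Black", "Black", "White", "White", "White"],
    "predicted": [1, 0, 1, 1, 0, 0],
    "observed":  [0, 0, 1, 1, 0, 0],
})

# Accuracy computed separately within each group; a gap between groups
# signals that outcomes for one group are harder to predict from the
# available data, not necessarily that the algorithm is biased.
accuracy_by_group = (
    results.assign(correct=results["predicted"] == results["observed"])
           .groupby("group")["correct"]
           .mean()
)
print(accuracy_by_group)
```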
The study in Pittsburgh underscores the importance of assessing machine learning algorithms before moving forward with implementation. The findings suggest that there are, indeed, benefits to adopting a more technologically advanced approach, but those benefits come with drawbacks, too. In the case of predicting near-term academic risks, school districts will likely need to weigh the benefits of being able to target services for the students most in need against the steeper financial and logistical requirements of using such a tool. More broadly, the study in Pittsburgh demonstrates that adopting machine learning algorithms to inform policy decisions may come with programmatic, financial, and logistical trade-offs, a lesson that undoubtedly applies in other policy contexts as well.