The Reliability of Graduate Medical Education Quality of Care Clinical Performance Measures
Graduate medical education (GME) program leaders struggle to incorporate quality measures in the ambulatory care setting, leading to knowledge gaps on how to provide feedback to residents and programs. While nationally collected quality of care data are available, their reliability for individual resident learning and for GME program improvement is understudied.
To examine the reliability of the Healthcare Effectiveness Data and Information Set (HEDIS) clinical performance measures in family medicine and internal medicine GME programs and to determine whether HEDIS measures can inform residents and their programs with their quality of care.
From 2014 to 2017, we collected HEDIS measures from 566 residents in 8 family medicine and internal medicine programs under one sponsoring institution. Intraclass correlation was performed to establish patient sample sizes required for 0.70 and 0.80 reliability levels at the resident and program levels. Differences between the patient sample sizes required for reliable measurement and the actual patients cared for by residents were calculated.
The highest reliability levels for residents (0.88) and programs (0.98) were found for the most frequently available HEDIS measure, colorectal cancer screening. At the GME program level, 87.5% of HEDIS measures had sufficient sample sizes for reliable measurement at alpha 0.7 and 75.0% at alpha 0.8. Most resident level measurements were found to be less reliable.
GME programs may reliably evaluate HEDIS performance pooled at the program level, but less so at the resident level due to patient volume.