Variations in How Medical Researchers Report Patient Demographics: A Retrospective Analysis of Published Articles

Published: Apr 01, 2023
Publisher: eClinical Medicine, Part of The Lancet Discovery Science, vol. 58

Erika E. Lynn-Green

Avery A. Ofoje

Robert H. Lynn-Green

David S. Jones


The use of demographic variables in the medical literature has been much debated. Recent studies found that race and socioeconomic status (SES) are inconsistently reported, and best-practice use of sex and gender has been contentious. We aimed to characterise the state of medical demographic reporting in greater detail, especially regarding geography and the specific terms used in articles.


Original articles were included from issues of the New England Journal of Medicine (NEJM), JAMA, The Lancet, and the American Journal of Epidemiology (AJE) published from 1 January to 31 December 2020 (n = 640). Case reports, articles without human participants, and articles with only aggregate data were excluded, leaving 594 articles. The use of age, sex, gender, race, ethnicity, and SES was coded, as was the geography of corresponding authors and participants.


99.0% of articles reported age. While 92.9% reported sex alone, only 4.7% used the term gender and 1.0% transgender. 47.8% of articles reported race and 29.6% reported ethnicity. Studies with U.S. corresponding authors or participants were significantly more likely to report race (72.9% and 73.7% respectively) or ethnicity (47.3% and 45.3% respectively) than those without (25.9% and 25.6% for race, 14.2% and 16.3% for ethnicity), p < 0.01 for all. Of articles reporting race, 40.9% used only a Black-white binary; of those reporting ethnicity, 85.2% included two or fewer terms. Under 5.0% of all articles used Office of Management and Budget (OMB) categories. Across all articles, 33.0% reported SES, from 15.2% in NEJM to 80.2% in AJE.


We found that while some factors (age, sex) are reported consistently, others (gender, race, ethnicity, SES) are not, despite recent attention. Authors often rely on binary or limited categories that inadequately capture human diversity. The presence of U.S. researchers or participants increased the reporting of race and/or ethnicity, highlighting wide variations that persist even as multinational collaborations become widespread. Researchers should reflect on their use of these terms, justify their decisions, and report results with care.
