Background

Studies on components of residency applications have shown evidence of racial bias. The Standardized Letter of Evaluation (SLOE) is an assessment measure for emergency medicine (EM) residency applications and, as more specialties opt to use SLOEs in place of narrative letters of recommendation, understanding bias on standardized assessments is essential.

Objective

To determine whether there is a difference in rankings on the EM SLOE between underrepresented in medicine (UIM) and non-UIM applicants and between White and non-White applicants, and to examine whether differences persist after controlling for other applicant characteristics.

Methods

The sample was drawn from medical students who applied to EM residency at the study institution in 2019. We compared rankings between UIM and non-UIM students and between students of each individual race/ethnicity and White students, after controlling for United States Medical Licensing Examination Step scores, Alpha Omega Alpha status, type of school (US MD, US DO, international medical graduate), Medical Student Performance Evaluation class percentile, affiliated program vs visiting clerkship SLOE, gender, and the interaction of race/ethnicity and gender, and adjusted for students submitting multiple SLOEs, using ordinal regression.

Results

There were 1555 applicants to the study institution in 2019; 1418 (91.2%) had a SLOE and self-identified race/ethnicity. After controlling for applicant characteristics, non-UIM students were significantly more likely to be ranked higher than UIM students on “Rank Against Peers” (OR 1.46, 95% CI 1.03-2.07) and Grade (OR 1.46, 95% CI 1.05-2.04).

Conclusions

Analysis of EM SLOEs submitted to our institution demonstrates racial bias on this standardized assessment tool, which persists after controlling for other performance predictors.

Objectives

To determine whether there are differences in rankings on the emergency medicine Standardized Letter of Evaluation (SLOE) by race.

Findings

After controlling for applicant characteristics, the emergency medicine SLOE demonstrates significant differences in rankings by race.

Limitations

This was a convenience sample of standardized letters submitted to one residency program during one application season.

Bottom Line

Standardized letters demonstrate racial bias similar to other assessment methods, and residency program directors need to be aware of these limitations when assessing residency applicants.

Racial discrimination and implicit racial bias are widespread throughout medical education, resulting in disparities in assessment measures and the residency Match.1-6  Studies demonstrate evidence of racial/ethnic bias in grading with an association between lower clerkship grades and non-White race/ethnicity,2  and significant systematic differences exist in the language used to describe White vs Black applicants on the Medical Student Performance Evaluation (MSPE).3  Further, social determinants of learning can disproportionately and negatively influence the standardized test scores of underrepresented in medicine (UIM) students.7,8 

Importantly, the impact of racial/ethnic disparities in assessment has created inequities in the residency Match. Studies have shown that a higher proportion of Black5 and UIM students6 are denied residency interviews compared to White or non-UIM students when using a minimum cutoff score for United States Medical Licensing Examination (USMLE) Step 1. As Step 1 transitions to pass/fail grading and more specialties develop a Standardized Letter of Evaluation (SLOE) to assess residency applicants, the effect of racial/ethnic bias on the SLOE and the potential for exacerbating racial/ethnic inequities in the Match must be considered.

In 1995, the emergency medicine (EM) SLOE was created to provide a more standardized and less biased assessment of medical students' clerkship performance.9  It consists of the following (see online supplementary data for an example SLOE):

  1. “Rank Against Peers”: students are ranked against all other students applying to EM residency who were assessed by the SLOE author

  2. Predicted placement on the institution's Match list

  3. Grade on the EM rotation on which the SLOE is based

  4. Qualities necessary for success in EM, ranked against peers

  5. Narrative portion

The EM SLOE has become the component of an application that program directors value most when selecting students to interview and rank.10-12  Other specialties (including otolaryngology, dermatology, orthopedics, and obstetrics and gynecology) have adopted a SLOE for their residency selection process, and the Coalition for Physician Accountability (COPA) recommends that all specialties cease using a narrative letter of recommendation in favor of a standardized evaluation letter.13  Understanding the influence of racial/ethnic bias on the EM SLOE could have broad implications for applicants across multiple specialties.

A subgroup analysis of a recent study comparing the EM SLOE to the Standardized Video Interview found that rankings on the EM SLOE “slightly favored White applicants”;14 however, no study has specifically examined racial/ethnic bias in the EM SLOE. The primary aim of this study is to determine whether there is a difference in rankings on the EM SLOE between UIM and non-UIM applicants and to examine whether differences persist after controlling for other factors in the application. The secondary aim is to determine whether there are SLOE ranking differences between non-Hispanic White and non-White applicants.

Setting and Participants

This is a retrospective quantitative document review study. The sample was drawn from the students who applied to the study institution's EM residency program (a Midwest, urban, university-based, 3-year program with 18 residents per class) in 2019, including all US MD, US DO, and international medical graduate (IMG) applicants. This represents 39% of the total EM applicant pool and 55% of the US MD EM applicant pool in 2019.15

Interventions

We obtained data from the Electronic Residency Application Service (ERAS) file of each student. SLOE rankings, student race/ethnicity, type of school (US MD/US DO/IMG), gender, USMLE Step 1 score, affiliated program vs visiting clerkship SLOE, Alpha Omega Alpha (AOA) status, and MSPE class percentile were paired and de-identified. Students self-identify their race/ethnicity on ERAS, with the following options: “American Indian or Alaska Native,” “Asian,” “Black or African American” (Black), “Hispanic, Latino, or of Spanish Origin” (Hispanic), “Native Hawaiian or Other Pacific Islander,” “White,” “Other,” or “Unknown.” Students may select as many as needed, or leave it blank.
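For illustration, the sketch below shows one way a UIM indicator could be derived from the multi-select ERAS race/ethnicity field; the data frame applicants, the column race_ethnicity, and the semicolon delimiter are hypothetical assumptions, not the study's actual extract or code.

```r
# Minimal sketch (hypothetical names): derive a UIM indicator from the
# self-identified ERAS race/ethnicity selections, which may include more
# than one category per applicant. Assumes a semicolon-delimited text field.
uim_categories <- c(
  "Black or African American",
  "Hispanic, Latino, or of Spanish Origin",
  "American Indian or Alaska Native",
  "Native Hawaiian or Other Pacific Islander"
)

applicants$uim <- vapply(
  strsplit(applicants$race_ethnicity, ";\\s*"),
  function(selections) any(selections %in% uim_categories),
  logical(1)
)
```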

Outcomes Measured

The 3 main variables on the SLOE are:

  1. Rank Against Peers (RAP), with possible rankings of top 10%, top 1/3, middle 1/3, and lower 1/3

  2. Rank List Prediction (RLP), with possible rankings of top 10%, top 1/3, middle 1/3, lower 1/3, and unlikely to rank

  3. Grade, which generally ranges across Honors, High Pass, Pass, and Fail

The RAP and Grade are 4-point ordinal scales; the RLP is a 5-point ordinal scale. A ranking of 1 corresponds to top 10% on the RAP and RLP and to Honors on the Grade.
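As a sketch of how these scales might be encoded for analysis, the snippet below stores each outcome as an ordered factor with the first level as the top ranking; the data frame sloes, its column names, and the exact label strings are illustrative assumptions.

```r
# Minimal sketch (hypothetical column names): encode the three SLOE outcomes
# as ordered factors, with the first level (1) being the top ranking.
sloes$rap <- factor(
  sloes$rap,
  levels = c("Top 10%", "Top 1/3", "Middle 1/3", "Lower 1/3"),
  ordered = TRUE
)
sloes$rlp <- factor(
  sloes$rlp,
  levels = c("Top 10%", "Top 1/3", "Middle 1/3", "Lower 1/3", "Unlikely to rank"),
  ordered = TRUE
)
sloes$grade <- factor(
  sloes$grade,
  levels = c("Honors", "High Pass", "Pass", "Fail"),
  ordered = TRUE
)
```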

Analysis of Outcomes

The primary outcome is a comparison of the RAP, RLP, and Grade between UIM (defined as students who identified as Black, Hispanic, American Indian/Alaska Native, and/or Native Hawaiian/Pacific Islander on their application)8  and non-UIM students, using the Wilcoxon rank sum test. Using ordinal regression, we controlled for USMLE Step scores, AOA status, type of school (US MD, US DO, IMG), MSPE class percentile, affiliated program vs visiting clerkship SLOE, gender and the interaction of race/ethnicity and gender, and adjusted for students submitting multiple SLOEs (students generally obtain at least 2 SLOEs, one from their affiliated program clerkship and one after a visiting clerkship, with some students completing multiple visiting clerkships). Because MSPE class percentiles are variable between schools (quartile, tertile, etc), percentiles were converted to an ordinal ranking with one being the top ranking.
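The paper does not publish its analysis code. As a rough sketch under stated assumptions, a cumulative link (proportional odds) mixed model with a random intercept per applicant is one way to control for the listed covariates while accounting for students contributing multiple SLOEs; the data frame sloes, its variable names, and the ordinal::clmm specification below are illustrative, not the authors' actual model.

```r
# Minimal sketch (hypothetical variable names): adjusted ordinal regression for
# Rank Against Peers with covariates and a per-applicant random intercept to
# account for multiple SLOEs per student. The study's exact specification may differ.
library(ordinal)

fit_rap <- clmm(
  rap ~ uim * gender +            # UIM status, gender, and their interaction
    step1_score + aoa + school_type + mspe_rank + affiliated_sloe +
    (1 | applicant_id),           # random intercept: several SLOEs per applicant
  data = sloes
)

summary(fit_rap)    # fixed effects on the log-odds scale
exp(fit_rap$beta)   # exponentiated fixed effects as odds ratios (direction depends on coding)
```

An alternative under the same assumptions would be a single-level proportional odds model (for example, MASS::polr) with cluster-robust standard errors; either approach addresses the non-independence of multiple SLOEs from the same applicant.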

The secondary outcome is a comparison of the RAP, RLP, and Grade between non-Hispanic White (White) students and Black, Asian, and Hispanic students, using the Wilcoxon rank sum test. Ordinal regression was used to control for the above performance variables. Statistics were performed using R Studio (RStudio, Boston, MA).
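A sketch of the unadjusted comparisons, again with hypothetical variable names, follows: the Wilcoxon rank sum test applied to the numeric ranking, first by UIM status and then for one pairwise race/ethnicity contrast.

```r
# Minimal sketch (hypothetical variable names): unadjusted Wilcoxon rank sum
# tests on the numeric RAP ranking (1 = top 10%).

# UIM vs non-UIM students
wilcox.test(as.numeric(rap) ~ factor(uim), data = sloes)

# One pairwise contrast, e.g., Black vs non-Hispanic White students
wilcox.test(
  as.numeric(rap) ~ factor(race_group),
  data = subset(sloes, race_group %in% c("White", "Black"))
)
```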

This study received Institutional Review Board exemption from the University of Chicago and the University of Illinois at Chicago.

In 2019, 1555 students applied to the study institution's EM residency program. Of these, 1493 applicants submitted at least one SLOE and, of those, 1418 students self-identified their race/ethnicity. In all, 3515 SLOEs were available for analysis from applicants who self-identified their race/ethnicity.

For each outcome variable, we assessed only the SLOEs in which that variable was assigned. There were 3507 SLOEs assigning a RAP, 3389 assigning an RLP, and 3041 that assigned a Grade and were not graded as pass/fail (SLOEs graded as pass/fail were not included in the grade analysis, as only 13% of SLOEs were graded pass/fail and, of those, 100% of students received a pass).
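As an illustrative sketch of that inclusion logic (hypothetical variable names), each outcome's analysis set keeps only SLOEs where the item was completed, and SLOEs from pass/fail-graded rotations are dropped from the Grade analysis:

```r
# Minimal sketch (hypothetical variable names): per-outcome analysis sets.
rap_sloes   <- subset(sloes, !is.na(rap))
rlp_sloes   <- subset(sloes, !is.na(rlp))
grade_sloes <- subset(sloes, !is.na(grade) & !pass_fail_rotation)  # exclude pass/fail-graded SLOEs
```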

Table 1 represents the demographics of the applicants in our sample who had at least one SLOE and self-identified their race/ethnicity and the demographics of the national EM applicant population for 2019.

Table 1

Applicant Demographics

Rank Against Peers

The median rank for RAP for the cohort was 2 (interquartile range [IQR] 2-3). Table 2 represents the percent of SLOEs with each rank by UIM status and race/ethnicity.

Table 2

Rank Against Peers—Distribution of Rankings by Race/Ethnicity

SLOEs from UIM students had significantly lower rankings on the RAP section of the SLOE compared with SLOEs from non-UIM students (P<.05). After controlling for gender, the interaction of race/ethnicity and gender, MSPE class percentile, AOA status, type of school, affiliated program vs visiting clerkship SLOE, and Step 1 score, non-UIM students were significantly more likely to be ranked higher than UIM students (OR 1.46, 95% CI 1.03-2.07).

SLOEs from Asian, Black, and Hispanic students had significantly lower rankings on the RAP section compared with SLOEs from White students (P<.05). After controlling for the predictors above, White students were significantly more likely to be ranked higher than Asian students (OR 1.46, 95% CI 1.00-2.13) and Black students (OR 1.81, 95% CI 1.15-2.83), respectively. There was no difference between White and Hispanic student rankings after controlling for the predictors.

Rank List Prediction

The median rank for RLP for the cohort was 2 (IQR 2-3). Table 3 represents the percentage of SLOEs with each rank by UIM status and race/ethnicity.

Table 3

Rank List Prediction—Distribution of Rankings by Race/Ethnicity

SLOEs from UIM students had significantly lower rankings on the Predicted Match section of the SLOE compared with SLOEs from non-UIM students (P<.05). After controlling for gender, MSPE class percentile, AOA status, type of school, and affiliated program vs visiting clerkship SLOE, non-UIM students were significantly more likely to be ranked higher than UIM students (OR 1.30, 95% CI 1.04-1.62). After controlling for Step 1 score and the interaction of race/ethnicity and gender, there was no difference between non-UIM and UIM rankings.

SLOEs from Asian, Black, and Hispanic students had significantly lower rankings on the Predicted Match section compared with SLOEs from White students (P<.05). After controlling for the above predictors, White students were significantly more likely to be ranked higher than Asian students (OR 1.49, 95% CI 1.04-2.15). After controlling for gender, MSPE class percentile, AOA status, type of school, affiliated program vs visiting clerkship SLOE, and Step 1 score, White students were significantly more likely to be ranked higher than Black students (OR 1.41, 95% CI 1.12-1.78). After controlling for the interaction of race/ethnicity and gender, there was no difference between White and Black student rankings. There was no difference in rankings between Hispanic and White students after controlling for predictors.

Grade

The median ranking for Grade for the cohort was 2 (IQR 1-2). Table 4 represents the percentage of SLOEs with each Grade by UIM status and race/ethnicity.

Table 4

Grade—Distribution of Rankings by Race/Ethnicity

SLOEs from UIM students had significantly lower rankings on the Grade section of the SLOE compared with SLOEs from non-UIM students (P<.05). After controlling for gender, the interaction of race/ethnicity and gender, MSPE class percentile, AOA status, type of school, affiliated program vs visiting clerkship SLOE, and Step 1 score, non-UIM students were significantly more likely to be ranked higher than UIM students (OR 1.46, 95% CI 1.05-2.04).

SLOEs from Asian, Black, and Hispanic students had significantly lower rankings on the Grade section compared with SLOEs from White students (P<.05). After controlling for the above predictors, White students were significantly more likely to be ranked higher than Black students (OR 1.65, 95% CI 1.08-2.52). There was no difference in grades between Asian or Hispanic students and White students after controlling for these predictors.

Rankings on the EM SLOE were lower for UIM students than for non-UIM students, and for Black, Asian, and Hispanic students than for White students, across all studied measures on the SLOE. Lower rankings for UIM students compared to non-UIM students on RAP and Grade persisted even after controlling for other factors, including MSPE class percentile and Step 1 score. Further, when controlling for other factors, Asian and Black students were ranked lower than White students on RAP, Asian students were ranked lower than White students on RLP, and Black students received lower grades than White students. These results demonstrate racial/ethnic bias on the EM SLOE, both for UIM students compared to non-UIM students and for Asian, Black, and Hispanic students compared to White students.

Our results are consistent with the racial/ethnic bias found in other forms of medical student assessment and add to the literature in 2 ways. First, we found that disparities in rankings by race/ethnicity persist on the EM SLOE after controlling for multiple measures of competency, suggesting that these disparities cannot be attributed to differences in clinical performance. Second, previous studies demonstrating racial/ethnic bias examined assessments in which there is a wide variability between schools.2,4  Our study shows that similar racial/ethnic bias is present on clinical assessments that are standardized throughout the country. The recent statement from COPA lists “the presence of individual and systemic bias” as one reason for all specialties to adopt a SLOE.13  This study demonstrates that simply making an assessment standardized does not eliminate racial/ethnic bias and further action via a systemic, anti-racist strategy is necessary.

One way to approach this problem is through a recently published framework to address systemic change through an anti-racist lens: See, Name, Understand, Act.16  The results from our study allow us to “See” the problem. Program and clerkship directors, in all specialties using a SLOE, need to acknowledge the limitations of a SLOE in assessing students of different racial/ethnic backgrounds and the potential to exacerbate racial/ethnic inequity in the Match by placing too much emphasis on a specific metric rather than conducting holistic review.17,18 

There are several limitations to this study. First, this study used a convenience sample rather than randomization; therefore, selection bias is possible. The SLOEs studied were submitted to a single institution, so results may not be generalizable to the entire applicant pool. Specifically, our sample is weighted heavily toward US MD applicants and underrepresents US DO and IMG applicants. Second, we only analyzed SLOEs that were submitted in applications; thus, SLOEs that students purposely did not submit, due to perceived bias or other reasons, were not included. Third, due to the low number of students identifying as American Indian/Alaska Native or Native Hawaiian/Pacific Islander, we did not have the power to include them in the White vs non-White analysis. They were, however, included within the UIM group for the UIM vs non-UIM analysis. Finally, the race/ethnicity of the SLOE author may also contribute to the presence or absence of bias. Although 68% of the US EM workforce identifies as White,19 to our knowledge no data exist describing the race/ethnicity of SLOE authors, and authors do not self-identify their race/ethnicity when creating a SLOE.

Future work can be directed by the rest of the anti-racist framework—“Name,” “Understand,” and “Act.” Leaders in education need to “Name” the problem. The consistency of findings across many different assessment measures in medical education demonstrates that the identified racial/ethnic disparities are not isolated to specific situations, institutions, or assessment tools. Education leaders must be willing to name the more insidious factors leading to assessment disparities by race/ethnicity across the spectrum of assessment, including systemic racial/ethnic inequalities and social determinants of education.7,20 Next, we need to further “Understand” the problem to propose effective solutions. The structure of the SLOE may contribute to racial bias and represents an opportunity to further understand how changes to an assessment tool can affect equity. The current norm-referenced “Rank Against Peers” likely introduces bias into the assessment that could be mitigated by changing to a criterion-based assessment. Recent work suggests that using a “deficit-based” approach to assessment (such as placing students into a lower-third ranking) rather than a competency-based approach may “disproportionately disadvantage UIM learners.”21 Additionally, the literature suggests that any assessment built on comparison to peers is inherently inequitable.22 Changing to a criterion-based system would require writers to anchor their assessments of students to specific competency descriptors and would reduce subjectivity. Finally, these findings should be a call for further immediate “Action,” including using group SLOEs with diverse faculty representation21 and committing to exploring solutions to the pervasive racial bias present in medical student assessment uncovered by us and others.1-8 As multiple specialties currently use the SLOE and more continue to adopt it, this will not be a problem limited to EM.

Rankings on EM SLOEs submitted to the study institution demonstrate disparities by race/ethnicity after controlling for other measures of competency and achievement.

References

1. Wijesekera TP, Kim M, Moore EZ, Sorenson O, Ross DA. All other things being equal: exploring racial and gender disparities in medical school honor society induction. Acad Med. 2019;94(4):562-569.
2. Lee KB, Vaishnavi SN, Lau SK, Andriole DA, Jeffe DB. "Making the grade:" noncognitive predictors of medical students' clinical clerkship grades. J Natl Med Assoc. 2007;99(10):1138-1150.
3. Ross DA, Boatright D, Nunez-Smith M, Jordan A, Chekroud A, Moore EZ. Differences in words used to describe racial and gender groups in Medical Student Performance Evaluations. PLoS One. 2017;12(8):e0181659.
4. Boatright D, Ross D, O'Connor P, Moore E, Nunez-Smith M. Racial disparities in medical student membership in the Alpha Omega Alpha Honor Society. JAMA Intern Med. 2017;177(5):659-665.
5. Edmond MB, Deschenes JL, Eckler M, Wenzel RP. Racial bias in using USMLE Step 1 scores to grant internal medicine residency interviews. Acad Med. 2001;76(12):1253-1256.
6. Spector AR, Railey KM. Reducing reliance on test scores reduces racial bias in neurology residency recruitment. J Natl Med Assoc. 2019;111(5):471-474.
7. Muller D, Hurtado A, Cunningham T, et al. Social determinants, risk factors, and needs: a new paradigm for medical education. Acad Med. 2022;97(suppl 3):12-18.
8. McDade W, Vela MB, Sánchez JP. Anticipating the impact of the USMLE Step 1 pass/fail scoring decision on underrepresented-in-medicine students. Acad Med. 2020;95(9):1318-1321.
9. Keim SM, Rein JA, Chisholm C, et al. A standardized letter of recommendation for residency application. Acad Emerg Med. 1999;6(11):1141-1146.
10. Breyer MJ, Sadosty A, Biros M. Factors affecting candidate placement on an emergency medicine residency program's rank order list. West J Emerg Med. 2012;13(6):458-462.
11. Love JN, Smith J, Weizberg M, et al. Council of Emergency Medicine Residency Directors' standardized letter of recommendation: the program director's perspective. Acad Emerg Med. 2014;21(6):680-687.
12. Negaard M, Assimacopoulos E, Harland K, Van Heukelom J. Emergency medicine residency selection criteria: an update and comparison. AEM Educ Train. 2018;2(2):146-153.
13. Coalition for Physician Accountability. Initial Summary Report and Preliminary Recommendations of the Undergraduate Medical Education to Graduate Medical Education Review Committee. Accessed July 13, 2022.
14. Hopson LR, Regan L, Bond MC, et al. The AAMC standardized video interview and the electronic standardized letter of evaluation in emergency medicine: a comparison of performance characteristics. Acad Med. 2019;94(10):1513-1521.
15. Association of American Medical Colleges. ERAS Statistics: Emergency Medicine. Accessed August 6, 2021.
16. Solomon SR, Atalay AJ, Osman NY. Diversity is not enough: advancing a framework for antiracism in medical education. Acad Med. 2021;96(11):1513-1517.
17. Katsufrakis PJ, Uhler TA, Jones LD. The residency application process: pursuing improved outcomes through better understanding of the issues. Acad Med. 2016;91(11):1483-1487.
18. Pope AJ, Carter K, Ahn J. A renewed call for a more equitable and holistic review of residency applications in the era of COVID-19. AEM Educ Train. 2021;5(1):135-138.
19. Association of American Medical Colleges. Diversity in Medicine: Facts and Figures 2019.
20. Evans MK, Rosenbaum L, Malina D, Morrissey S, Rubin EJ. Diagnosing and treating systemic racism. N Engl J Med. 2020;383(3):274-276.
21. Lucey CR, Hauer KE, Boatright D, Fernandez A. Medical education's wicked problem: achieving equity in assessment for medical learners. Acad Med. 2020;95(suppl 12):98-108.
22. Teherani A, Perez S, Muller-Juge V, Lupton K, Hauer KE. A narrative study of equity in clinical assessment through the antideficit lens. Acad Med. 2020;95(suppl 12):121-130.

Author notes

Editor's Note: The online version of this article contains an example Standardized Letter of Evaluation.

Funding: The authors report no external funding source for this study.

Competing Interests

Conflict of interest: The authors declare they have no competing interests.

Supplementary data