Bias critically and adversely impacts the assessment of individuals at many stages in the developmental continuum of a physician’s career, including admission to medical school, progression through training, career prospects and advancement, and termination decisions.1,2 Studies have focused primarily on disparities for women, and for racial and ethnic groups, but implicit and explicit bias also negatively affects many other groups underrepresented in medicine (UIM), such as LGBTQIA+ individuals and those with different abilities or from nonpredominant religious groups.3,4 Standardized test scores, clerkship grades, letters of recommendation, honor society memberships, research opportunities, and formative and summative assessments are all traditionally viewed as indicators of proficiency and predictors of future success, yet are also increasingly recognized as vulnerable to bias.5-7 Despite the prevalence of systemic bias in medical education, there are limited published reports of interventions to minimize effects from bias on important outcomes. Like other journals, the Journal of Graduate Medical Education (JGME) has received relatively few submissions that take a broader, more systemic view of the extent of bias in graduate medical education (GME) assessment practices, or that examine promising practices to mitigate it.
Research shows that bias is evident at multiple critical points in medical education, including the initial acceptance decision to medical school, where standardized entrance examinations, like the Medical College Admission Test, disadvantage UIM applicants.8,9 Disparities persist throughout undergraduate medical education, where UIM students frequently receive lower clerkship grades.1,6 Bias has also been identified in narrative assessments, including the Medical Student Performance Evaluation (formerly the Dean’s letter) and letters of recommendation to GME programs.10 Letters written for women and UIM students contain fewer standout adjectives such as “exceptional” or “outstanding” than those written for men and racial majority counterparts.10 During residency, many of these assessment biases persist. Racial and gender bias appears to exist in Accreditation Council for Graduate Medical Education competencies and Milestone achievement, as some studies have found that White residents attain a higher level of Milestone achievement than non-White trainees.11 Similarly, women GME learners receive conflicting feedback regarding autonomy and assertiveness, whereas men residents and fellows receive more constructive feedback, progress through training at a faster pace, and are granted more autonomy than women.11 These disparities can lead to what Teherani and colleagues describe as an “amplification cascade,” in which small differences in assessment accumulate longitudinally and result in enduring disparities throughout later training and a physician’s career.12 Far fewer studies compare assessments of individuals with different abilities, gender identities, or ethnic or religious backgrounds with those of the majority group in a GME program, institution, or specialty. We have little information, aside from opinion pieces and personal essays, as to the perceived and actual effects of bias on assessments and careers.
Given the evidence that shows important benefits—to health professionals as well as patients—of enhanced diversity in medicine, ensuring successful career growth for nonmajority individuals should be a priority.
Promising Areas for Future Study
Individual Focus
Adapting some interventions that have been effective in decreasing bias in clinical encounters to the educational environment may prove beneficial in mitigating assessment bias. Many clinical strategies involve implicit bias training, with the hope that if physicians recognize their own biases, there will be a reduction in health care disparities.13 However, social science research has shown that recognizing implicit bias is not sufficient.14 Implicit bias training should also include concrete strategies to reduce bias, such as perspective taking, stereotype replacement, and counterstereotype imaging.13,14 For an assessment example, an attending could utilize the perspective taking strategy when completing an assessment form by imagining themselves as the resident. Professional development exercises could include faculty members role-playing the part of a resident during a simulated feedback session.
Hagiwara et al suggest that minimizing the impact of bias through improved clinician communication may be a more realistic intervention target than reducing implicit bias.15 Studies demonstrate that implicit bias can manifest itself through body language, such as eye contact and body distance, and how one speaks to patients, rather than the content of speech. These strategies might be adapted from the clinician-patient interaction to teaching experiences and be practiced and reinforced. When providing critical feedback, educators could practice substituting more inclusive behavioral choices for negative nonverbal (gestures, eye contact, body stance) and paraverbal (tone, pitch, and volume of speech) behaviors.15 For example, body postures, such as crossing arms, leaning away from the trainee, or avoiding eye contact might be replaced by more inclusive behaviors, including maintaining an open body posture and comfortable eye contact, and leaning forward slightly to express engagement and interest. In this way, essential feedback might be provided in a manner that is more supportive and respectful of the trainee, and perhaps less influenced by bias. If practiced, over time these strategies might become more natural and automatic. Even when these behaviors are not explicitly linked to assessment, studies could examine whether inclusive nonverbal and paraverbal behaviors strengthen the trainee-educator relationship and provide positive role modeling for trainees who will deliver feedback to others.
Promising work contributed by Gonzalez et al for patient care may have applicability to inclusive teaching as well.16 Their findings suggest that, even when patients perceive bias, the outcome of the clinical encounter may still be positive depending upon the physician’s subsequent actions.16 In focus groups with Black and Latinx patients, most participants reported that, after an incident of perceived bias, they most wanted acknowledgement of the biased behavior, followed by an apology. “Restoring the relationship…can lead to the same outcome as never having demonstrated bias in the first place.”16 Similarly, educators can focus on repairing relationships with trainees after instances of perceived bias if open dialog is encouraged. Educators who recognize or are told of perceived bias can apologize to trainees, remain nondefensive, engage in faculty development to learn different approaches, and intentionally practice more inclusive behaviors in the future.
Institutional Focus
Even when methods directed at individuals improve individual behaviors, institutional and organizational changes are likely needed to ensure equity in assessment. It is straightforward to recommend or even mandate GME institutional improvements in reducing assessment bias, but how can this be accomplished through strategies that are feasible as well as effective? There is minimal to no evidence demonstrating how residency programs can routinely evaluate their assessment modalities and practices for potential inequities among subgroups of learners and use continuous quality improvement methods to address bias.17 For example, the Canadian GME Competency by Design programmatic assessment initiative has had mixed results.18-20 Faculty could be trained in how to use standardized rubrics or other tools that are criterion referenced, competency-based, and nonnormative, but it is unclear how to accomplish this consistently and sustain it over time.17 Furthermore, educators need to focus their assessments on direct observation of authentic work-based skills, such as entrustable professional activities, but studies report many barriers to direct observation assessments.20,21 Some programs have had success in inserting frequent, competency-based, and directly observed assessments, with variable acceptance by trainees and faculty; these innovations could be studied in other settings and programs.22 Recommendations to improve assessment also include that assessors “slow” down when assessing learners.23 Bias is more likely in stressful settings with time pressures and fatigue.23 Teaching deceleration strategies (eg, taking a deep breath and centering oneself before assessment) could be a feasible, time-efficient way to reduce faculty stress, and could be studied by comparing assessments done with and without this brief maneuver.23 Artificial intelligence and natural language models are also beginning to be used to aggregate assessments, but how these can best be adapted to eliminate or mitigate bias is not yet known.24 Kiyasseh and colleagues address the elephant in the room and the question to which, right now, we have no answer: “how much bias mitigation is sufficient.”25
Conclusion
Bias in medical education assessment endures despite enhanced awareness. For those committed to reducing its influence and enhancing careers for diverse trainees, there are some individual and institutional approaches that need to be studied and then disseminated for GME contexts. Single interventions will generally not work or not be sustained, and bias might paradoxically be increased.26 There are insufficient studies overall and their findings are mixed, with disagreements regarding which strategies have merit. Multiple longitudinal interventions are likely to be the most effective but will be difficult and expensive to study. Gonzalez and colleagues’ suggestion that “implicit bias recognition and management must be reframed as an epistemology of practice…essential to the professional identity of medical learners to be effective”16 may provide a useful, perhaps inspirational construct as we consider next steps in addressing assessment bias. JGME welcomes your thoughts on this important topic.