In this issue of the Journal of Graduate Medical Education, Vokes et al analyze grade distributions for core clerkships among student applicants to orthopedic surgery from 86 of 133 Association of American Medical Colleges (AAMC) medical schools in the 2017 Match cycle.1  To be included, schools needed to provide grade distributions and a standard grading system within application materials. Within the surgery clerkships of these schools, the median rate of “honors” was 32.5%, with a wide range from 5% to 67%. Similarly high inter-institutional variability was found in all clerkships. This is the latest addition to the literature demonstrating that current medical school assessments are inadequate for reliably evaluating candidates for residency.

Meanwhile, application volume continues to stress the graduate medical education (GME) community. In 2019, 47 012 applicants submitted an average of 92.0 Electronic Residency Application Service (ERAS) applications to residency training programs.2  Compared to 2015, this represents a 6.6% increase in applicants and a 14% increase in applications per applicant. Each specialty has experienced different trends: while the number of applicants to orthopedic surgery remained stable over that period, each applicant submitted 20% more applications. National Resident Matching Program (NRMP) data from 2019 showed that 33 875 available postgraduate year 1 positions were filled through the Match. Several notable trends emerged that year: the percentage of US allopathic seniors who matched to their first-choice program was 47.1% (the lowest on record), and participation by applicants from US osteopathic medical schools rose as a result of the single accreditation system (an all-time high of 6001 osteopathic candidates in the Match).3

Faced with this onslaught of applications, most program directors continue to rely on a handful of traditional assessments for initial review. The NRMP periodically surveys program directors to ascertain the importance of various factors in applicant selection. In its most recent survey, from 2018, the top 6 factors cited in selecting applicants to interview were United States Medical Licensing Examination/Comprehensive Osteopathic Medical Licensing Examination of the United States (USMLE/COMLEX-USA) Step 1 scores, specialty-specific letters of recommendation, Medical Student Performance Evaluations (MSPEs), USMLE/COMLEX-USA Step 2 scores, personal statements, and grades in required clerkships.4  A total of 70% of program directors cited clerkship grades as a factor in offering interview slots. When ranking applicants for the Match, however, their importance dropped to 54%, making clerkship grades only the 17th most important factor.

So why are clerkship grades felt to be less important than other subjective and objective data? AAMC data from 2019 showed that only 12 schools accredited by the Liaison Committee on Medical Education (LCME) use a strictly pass-fail scoring mechanism for required clinical clerkships, while 149 do not.5  Grade data are not designed to report the selectivity or rigor of the medical school. Rather, grades are meant to measure students' performance relative to their peers at that institution, as well as the adequacy of instruction. Furthermore, each institution weighs the various components of grades (clinical performance, written examinations, observed clinical examinations, etc) differently, thereby adding to the variability. As Vokes et al have shown, clerkship grades are unreliable and lack validity for comparing students from different institutions.

Meanwhile, MSPEs remain an important component of an applicant's portfolio. An analysis of MSPEs following the AAMC's 2017 MSPE recommendations showed that, while these documents have become more standardized and transparent with regard to medical student evaluation, only 69.9% of schools reported school-wide summative performance data.6  As such, there is still work to be done to standardize this document.

To help address these issues, changes are underway for some of the other most-trusted documents in an applicant's portfolio. In March 2019, the National Board of Medical Examiners convened the Invitational Conference on USMLE Scoring to discuss the use and reporting of USMLE scores.7  While no substantive changes were recommended, one aspirational goal from the report stands out: “The current system for residency application does not provide program directors with sufficient options for combining and sorting on multiple domains. Building additional sorting/analytic tools into the residency application system should be prioritized. Such tools could give program directors the ability to sort applicants on overall/holistic profiles consisting of many measures, not simply single measures (such as USMLE Step 1), with the intent of finding applicants most suitable to their programs—based on that program's objectives and strengths.”7 

The LCME clearly charges the faculty of a medical school with “the assessment of medical students' progress in developing the competencies that the profession and the public expect of a physician.”8  At its crux, the issue is whether this assessment of competency is a binary or a continuous variable. We would maintain that it can be expressed as both, to different audiences. To the public, which entrusts medical schools to confer a medical degree upon students, the variable is binary: yes, this student is competent to serve as a physician, or no, they are not. To receiving residency training programs, this assessment (as transmitted through clerkship grades, class ranks, and MSPEs) should be expressed as a continuous variable that demonstrates mastery learning.

Part of the solution must include the ability of the information technology platforms used in recruiting (most importantly ERAS) to allow for sorting and searching beyond USMLE scores. This would involve 2 components. First, medical schools would need to express their evaluations (including MSPEs and letters of evaluation) in a way that allows program directors to search and sort. Second, ERAS would need to build the software infrastructure to make searching and sorting user-friendly. Additional ranking data, such as the academic rigor of the medical school or the clerkship grading schema, would be helpful in distinguishing medical students from different schools.
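To make the idea concrete, consider the following minimal sketch in Python of what structured, sortable applicant data might allow. The field names (clerkship_percentile, mspe_tier, school_rigor_index) and the weighting scheme are entirely hypothetical illustrations of holistic sorting; they are not part of any actual ERAS or AAMC schema.

```python
# Hypothetical sketch: multi-criteria filtering and sorting over structured
# applicant records, rather than ranking on a single examination score.
from dataclasses import dataclass


@dataclass
class ApplicantRecord:
    applicant_id: str
    step2_score: int             # licensing examination score
    clerkship_percentile: float  # school-reported percentile in required clerkships
    mspe_tier: int               # summative MSPE tier, 1 (highest) to 4
    school_rigor_index: float    # hypothetical measure of grading stringency


applicants = [
    ApplicantRecord("A001", 245, 88.0, 1, 0.72),
    ApplicantRecord("A002", 252, 60.0, 2, 0.35),
    ApplicantRecord("A003", 238, 93.0, 1, 0.81),
]

# Filter on a holistic profile, then sort by clerkship performance
# weighted by the grading stringency of the school that reported it.
shortlist = [a for a in applicants if a.mspe_tier <= 2 and a.step2_score >= 235]
shortlist.sort(key=lambda a: a.clerkship_percentile * a.school_rigor_index, reverse=True)

for a in shortlist:
    print(a.applicant_id, round(a.clerkship_percentile * a.school_rigor_index, 1))
```

The point of the sketch is not the particular weights but the workflow: once evaluations arrive as structured fields rather than free text, a program director can sort on any combination of measures that matches the program's objectives.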

More powerful data analytics at the GME level would be helpful as well. We as a community should strive to measure graduating medical students with all the available metrics, then analyze each receiving training program for its ability to train residents, and finally follow those graduated residents with clinical outcomes in their first few years of practice. Such multivariable analysis would help educators across the spectrum determine which metrics are truly important and which are not. The Association of Program Directors in Surgery, as an example, is beginning to construct such a database, which will serve as a model for other specialties.9
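As a rough illustration of the kind of analysis such a database could support, the sketch below fits an ordinary least-squares model linking medical school metrics and training program quality to an early-practice outcome. The variable names, simulated data, and choice of model are our assumptions for illustration only, not a description of the APDS project.

```python
# Illustrative multivariable analysis on simulated data: which undergraduate
# and graduate medical education measures carry independent weight for a
# downstream practice outcome?
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical metrics for n graduates.
step_score = rng.normal(230, 15, n)      # licensing examination score
clerkship_pct = rng.uniform(10, 99, n)   # clerkship percentile at the school
program_rating = rng.normal(0, 1, n)     # receiving program's training quality

# Hypothetical early-practice outcome (e.g., a risk-adjusted performance index).
outcome = 0.02 * clerkship_pct + 0.5 * program_rating + rng.normal(0, 1, n)

# Ordinary least squares via the normal equations.
X = np.column_stack([np.ones(n), step_score, clerkship_pct, program_rating])
coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)

for name, b in zip(["intercept", "step_score", "clerkship_pct", "program_rating"], coef):
    print(f"{name:>15}: {b:+.3f}")
```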

None of these solutions are easy, and all require consensus and cooperation from stakeholders across medical education to ensure fairness and to promote diversity. But solving the problem of inadequate evaluation and standardization is something we owe to the public we serve.

1. Vokes J, Greenstein A, Carmody E, Gorczyca J. The current status of medical school clerkship grades in residency applicants. J Grad Med Educ. 2020;12(2):145-149.
2. Association of American Medical Colleges. Residency Specialties Summary, ERAS Statistics. 2020.
3. National Resident Matching Program. Results and Data, 2019 Main Residency Match. 2020.
4. National Resident Matching Program. Results of the 2018 NRMP Program Director Survey. 2020.
5. Association of American Medical Colleges. Grading systems used by US medical schools. 2020.
6. Hook L, Salami AC, Diaz T, Friend KE, Fathalizadeh A, Joshi ART. The revised 2017 MSPE: better, but not “outstanding.” J Surg Educ. 2018;75(6):e107-e111.
7. National Board of Medical Examiners. Summary report and preliminary recommendations from the Invitational Conference on USMLE Scoring (InCUS), March 11-12, 2019. 2020.
8. Liaison Committee on Medical Education. Standard 6.1, Functions and Structure of a Medical School, Standards for Accreditation of Medical Education Programs Leading to the MD Degree. 2020.
9. Association of Program Directors in Surgery. Request for proposal in support of New Educational Quality Improvement Program (EQIP). 2020.