Under the single GME accreditation system, residency programs receive applicants from MD- and DO-granting medical schools, each of which has its own set of licensing examinations, making concordance studies increasingly relevant. Previous studies comparing Comprehensive Osteopathic Medical Licensing Examination of the United States (COMLEX-USA) and United States Medical Licensing Examination (USMLE) scores have been limited in sample size and examinee composition and have yielded comparisons that may not be generalizable across all applicants. Some osteopathic medical students take USMLE in addition to COMLEX-USA, often at considerable cost and effort, with the aim of making themselves more desirable to potential residency programs. More reliable comparisons of COMLEX-USA and USMLE scores would allow program directors to better estimate a score on the alternate examination.
To derive an accurate concordance between COMLEX-USA and USMLE scores, based on a large sample of osteopathic students who took both examinations.
Five colleges of osteopathic medicine, representing various regions of the United States, participated in this study. The data included demographics and COMLEX-USA and USMLE scores from September 2015 through August 2020 for students who took both examinations. We derived the concordance between COMLEX-USA and USMLE scores using equipercentile matching.
Comparisons of demographic characteristics showed only minor differences between the sample and the overall population for COMLEX-USA takers, although scores for the study sample were, on average, greater.
A strong association exists between the scores on the COMLEX-USA and USMLE examinations, allowing prediction of performance on USMLE from COMLEX-USA.
The goal of this study is to use large and diverse samples to derive credible comparisons of achievement of physician candidates who have taken different licensure examinations.
Nonlinear modeling methods allow accurate comparison of Comprehensive Osteopathic Medical Licensing Examination of the United States (COMLEX-USA) Level 1 and Level 2-CE and United States Medical Licensing Examination (USMLE) Step 1 and Step 2 CK.
With our study based on candidates who have taken both COMLEX-USA and USMLE, we cannot be certain of a candidate's motivation for taking an examination that is not required for that candidate's licensure as a physician.
With our large samples we derive concordance tables between COMLEX-USA and USMLE that may reduce the pressure on DO students to take USMLE.
In 2020, the transition to a single accreditation system for graduate medical education (GME) entered its final phase.1 Residency program directors are considering applicants from all over the United States, from both MD- and DO-granting schools, as well as international medical graduates (IMGs) certified by the Educational Commission for Foreign Medical Graduates. In trying to evaluate these diverse but similarly well-qualified individuals, program directors look for comparable markers of ability, such as scores on licensing examinations. Those in US MD and IMG pathways take the United States Medical Licensing Examination (USMLE), co-owned by the National Board of Medical Examiners and the Federation of State Medical Boards. Osteopathic physician licensure requires the National Board of Osteopathic Medical Examiners (NBOME) Comprehensive Osteopathic Medical Licensing Examination of the United States (COMLEX-USA).2 Both examinations comprise multiple steps, with roughly corresponding timelines. COMLEX-USA Level 1 and USMLE Step 1 are typically taken near or after the end of the second year of medical school, while COMLEX-USA Level 2-CE and USMLE Step 2 CK are typically taken near or after the end of the third year of medical school. Both COMLEX-USA Level 3 and USMLE Step 3 are taken after graduation.
Given the transition to a single GME accreditation system, increasing numbers of DO candidates have taken at least 1 step of the USMLE to augment their residency applications.3 This trend is based on the belief that having USMLE scores allows for more direct comparison with US MD and IMG residency applicants. For those applicants, this adds stress, time, and expense to an already taxing transition to GME. All DO students are required to take and pass COMLEX-USA Level 1 and Level 2-CE prior to graduation, and all US licensing jurisdictions require or accept COMLEX-USA for medical licensure for DOs. While 86% of program directors state that they require COMLEX-USA results for DO students, some programs still request or even require USMLE scores.4 Reasons cited for this include lack of understanding of COMLEX-USA score scales and difficulty comparing performance across these 2 licensing examinations.5 The latter argument persists despite tools such as the percentile score converter from the NBOME.6 In 2019, among DO students, there were approximately 8200 first-time administrations (with the candidate taking the respective examination for the first time) of USMLE Step 1 or Step 2 CK (with the calendar year Step 2 CK number estimated from the reported number for the 2018–2019 and 2019–2020 academic years—July 1 through June 30).3 This is roughly 57% of the total number of first-time administrations, by DO students in 2019, of COMLEX-USA Level 1 or Level 2-CE.
While requesting USMLE scores from DO students yields scores on the same scale for all applicants, this strategy is expensive. At the 2021–2022 price of $645 per examination for Step 1 and Step 2 CK,7 the 8200 additional examinations would cost DO students collectively over $5 million. Furthermore, this requires DO students to spend time and effort outside their curricula to prepare for an examination not precisely aligned with their osteopathic training or future practice. Other suggested solutions, such as abolishment of the COMLEX-USA examination series, fail to recognize the distinctiveness of the DO degree and the available evidence supporting the validity of COMLEX-USA for licensure of osteopathic physicians.8,9 Blueprints for the examinations, although similar with respect to some biomedical science and clinical concepts covered, diverge in areas including test specifications and inclusion of osteopathic principles and practice and osteopathic manipulative treatment on the COMLEX-USA series.10–13
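The collective cost figure above can be checked with a few lines of Python. This is only an arithmetic sketch using the two numbers quoted in the text (8200 administrations and a $645 fee); the variable names are ours.

```python
# Figures quoted in the text: approximate first-time USMLE administrations
# among DO students in 2019 and the 2021-2022 fee per examination.
administrations = 8200
fee_per_exam_usd = 645

total_cost_usd = administrations * fee_per_exam_usd
print(f"${total_cost_usd:,}")  # $5,289,000
```

The product, about $5.3 million, is consistent with the statement that the additional examinations cost DO students collectively over $5 million.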
Previous studies have investigated the relationship between performance on these 2 examinations but have not produced a dependable way to compare COMLEX-USA and USMLE scores.14–19 Limitations of these investigations have included small sample sizes, narrow student composition (eg, students from one school15,17,18 or one residency program16), and the use of linear models,19 potentially yielding comparisons that may not be generalizable across the entire student population. This leaves some faculty advisors and program directors continuing the redundant cycle of recommending that DO students take USMLE.20
The purpose of this study is to compare the scores on COMLEX-USA Level 1 and Level 2-CE with USMLE Step 1 and Step 2 CK for a large sample of DO students who took both examinations. By employing more sophisticated modeling techniques, our goal is to provide program directors with more accurate concordance information, allowing more credible comparisons of COMLEX-USA and USMLE scores between applicants.
Five colleges of osteopathic medicine, including one with more than one campus, participated in this study. The schools represent the Northeast, Southeast, Midwest, West, and West Coast of the United States, urban and more rural settings, and public and private institutions. The data included USMLE Step 1 scores and examination dates from October 2015 through August 2020, and USMLE Step 2 CK scores and dates from September 2015 through August 2020. Other data for the students in our sample who had taken USMLE, including COMLEX-USA scores and examination dates, dates of birth, gender, and expected graduation year, were collected from the NBOME database. For students who had taken an examination more than once, we considered only the score on the first attempt. To ensure that examination-order effects were unlikely to distort the associations, we considered only the students who took USMLE within 150 days of COMLEX-USA.
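The two inclusion rules described above (first attempt only, and examinations taken within 150 days of each other) can be sketched in Python as follows. The record layout, a list of (date, score) tuples per student and examination, and the function names are hypothetical and chosen only for illustration; they do not reflect the structure of the NBOME database.

```python
from datetime import date

MAX_GAP_DAYS = 150  # proximity window stated in the text


def first_attempt(attempts):
    """Return the earliest (exam_date, score) tuple from a list of attempts."""
    return min(attempts, key=lambda attempt: attempt[0])


def eligible_pair(comlex_attempts, usmle_attempts, max_gap_days=MAX_GAP_DAYS):
    """Return (comlex_score, usmle_score) for first attempts taken no more
    than max_gap_days apart, or None if the student is excluded."""
    comlex_date, comlex_score = first_attempt(comlex_attempts)
    usmle_date, usmle_score = first_attempt(usmle_attempts)
    if abs((usmle_date - comlex_date).days) <= max_gap_days:
        return (comlex_score, usmle_score)
    return None


# Hypothetical student with a repeated COMLEX-USA attempt: only the first counts.
pair = eligible_pair(
    [(date(2018, 6, 1), 520), (date(2018, 9, 1), 560)],
    [(date(2018, 6, 15), 225)],
)
print(pair)  # (520, 225)
```

The absolute value of the date difference is used because the text places no restriction on which examination was taken first.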
Concordance refers to the established relationship between scores on assessments that measure similar but not identical constructs. This applies to USMLE Step 1/COMLEX-USA Level 1 and USMLE Step 2 CK/COMLEX-USA Level 2-CE. Concordance allows stakeholders to compare scores from similar assessments to make decisions. To the extent that both USMLE and COMLEX-USA scores are used to screen residency applicants, concorded scores will be beneficial. COMLEX-USA Level 1 is a problem- and symptom-based assessment, administered in a time-measured environment, which integrates the foundational biomedical sciences and other areas of medical knowledge relevant to solving clinical problems and promoting and maintaining health in providing osteopathic medical care to patients.2 Competency domains assessed include application of osteopathic medical knowledge, osteopathic patient care and osteopathic principles and practice, communication, professionalism, and ethics. Competency assessment occurs in the context of clinical and patient presentations and systems-based practice as required for entry into the supervised practice of osteopathic medicine as an independently practicing osteopathic generalist physician, and for readiness for lifelong learning and practice-based learning and improvement.10 Scores for both COMLEX-USA examinations in this study are reported on a scale from 9 to 999.
USMLE Step 1 assesses understanding of and ability to apply important concepts of the basic sciences to the practice of medicine, with special emphasis on principles and mechanisms underlying health, disease, and modes of therapy.11 Step 1 ensures mastery of not only the sciences that provide a foundation for the safe and competent practice of medicine in the present, but also the scientific principles required for maintenance of competence through lifelong learning. Step 1 is constructed according to an integrated content outline that organizes basic science material along 2 dimensions: system and process. Scores for both USMLE examinations in this study are reported on a scale from 1 to 300.21
COMLEX-USA Level 2-CE is a 1-day computer-based assessment that integrates application of knowledge in clinical science and foundational biomedical sciences and osteopathic principles, with other physician competencies related to the clinical care of patients and promoting health in supervised clinical settings.12 Competency domains assessed include application of osteopathic medical knowledge, osteopathic patient care and osteopathic principles and practice, communication, systems-based practice, practice-based learning and improvement, professionalism, and ethics. USMLE Step 2 CK assesses an examinee's ability to apply medical knowledge, skills, and understanding of clinical science essential for the provision of patient care under supervision and includes emphasis on health promotion and disease prevention.13 Step 2 CK ensures that due attention is devoted to principles of clinical sciences and basic patient-centered skills that provide the foundation for the safe and competent practice of medicine under supervision.
We examined the representativeness of our sample vis-à-vis the broader COMLEX-USA test-taking population, comparing the respective distributions of student age at the time of the COMLEX-USA, gender, and the timing of each of the examinations within the student's osteopathic medical studies. This timing was calculated using the examination date and the student's expected or actual graduation date, with the assumption that osteopathic medical programs last for 4 years and begin on August 1. Distributions were compared using frequency plots and empirical cumulative distribution plots. As with our sample, if students in the broader COMLEX-USA population took one of the COMLEX-USA examinations more than once, only the first attempt score on that examination was considered.
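The timing calculation described above can be illustrated with a short Python sketch. It assumes, as stated in the text, a 4-year program beginning on August 1; representing the graduation date by its calendar year and expressing timing in fractional years are our simplifications, and the function names are hypothetical.

```python
from datetime import date

PROGRAM_YEARS = 4  # assumption stated in the text


def program_start(graduation_year):
    """Assumed start of studies: August 1, four years before graduation."""
    return date(graduation_year - PROGRAM_YEARS, 8, 1)


def years_into_program(exam_date, graduation_year):
    """Approximate time elapsed in the program when the exam was taken."""
    elapsed_days = (exam_date - program_start(graduation_year)).days
    return elapsed_days / 365.25


# A hypothetical student graduating in 2020 who sat an exam on June 15, 2018,
# was near the end of the second year of study.
timing = years_into_program(date(2018, 6, 15), 2020)
print(round(timing, 2))  # 1.87
```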
With our sample of scores by students who took both examinations in rough temporal proximity, we derived the concordance between COMLEX-USA scores and USMLE scores using equipercentile matching of scores.22,23 With this method, the aim is to align the respective COMLEX-USA and USMLE score distributions so that, for any COMLEX-USA score x, the concorded USMLE score y is the score for which the probability of a student scoring y or less on USMLE is the same as the probability of a student scoring x or less on the corresponding COMLEX-USA assessment. For score distributions such as those in this study, the equipercentile method is roughly equivalent to linking the COMLEX-USA score x with the USMLE score y that has the same percentile rank as the score x.
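To make the percentile-rank description concrete, the following Python sketch links a score on one examination to the score on the other with the same percentile rank. It is a simplified illustration on raw (unsmoothed) samples, using mid-percentile ranks and linear interpolation between observed scores. It is not the study's code, and the function names and toy scores are hypothetical.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(sorted_scores, x):
    """Mid-percentile rank of x in [0, 1]: the share of scores below x
    plus half the share of scores equal to x."""
    below = bisect_left(sorted_scores, x)
    at_or_below = bisect_right(sorted_scores, x)
    return (below + 0.5 * (at_or_below - below)) / len(sorted_scores)


def equipercentile(x, scores_a, scores_b):
    """Map score x on Exam A to the Exam B score with the same percentile rank."""
    a = sorted(scores_a)
    b = sorted(scores_b)
    p = percentile_rank(a, x)
    ys = sorted(set(b))
    ranks = [percentile_rank(b, y) for y in ys]  # strictly increasing
    if p <= ranks[0]:
        return float(ys[0])
    if p >= ranks[-1]:
        return float(ys[-1])
    i = bisect_left(ranks, p)  # ranks[i - 1] < p <= ranks[i]
    fraction = (p - ranks[i - 1]) / (ranks[i] - ranks[i - 1])
    return ys[i - 1] + fraction * (ys[i] - ys[i - 1])


# Toy data only; these are not study scores.
exam_a = [400, 500, 600, 700, 800]
exam_b = [200, 220, 240, 260, 280]
print(equipercentile(600, exam_a, exam_b))  # 240.0
```

Because the mapping follows the two empirical distributions rather than a fitted line, it can bend wherever the distributions differ in shape, which is the nonlinearity the method is meant to capture.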
The first step in our application of the equipercentile method was construction of frequency tables of scores for each of the 4 examinations in this study. To reduce random variation in the sample score frequencies, the score distribution for each of the 4 examinations was smoothed using the loglinear method (polynomial degree 3). For COMLEX-USA Level 1, Figure 1 in the online supplementary data illustrates both the frequency distribution and the curve resulting from loglinear smoothing. On the assumption that the smoothed distributions provide more accurate estimates of the score distributions for all students who have taken both examinations, equipercentile matching was conducted with the smoothed distributions.
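As an illustration of what degree-3 loglinear smoothing does, the Python sketch below fits log expected frequency as a cubic polynomial in the (rescaled) score by Poisson maximum likelihood, using plain Newton iterations. It is a from-scratch illustration under our own assumptions (fixed iteration count, scores rescaled to [-1, 1], toy frequencies) and is not the routine the study ran.

```python
import math


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def loglinear_smooth(counts, degree=3, iterations=50):
    """Return smoothed frequencies with log(mu_j) polynomial in score j."""
    size = len(counts)
    # Rescale score positions to [-1, 1] so the powers stay well conditioned.
    z = [2.0 * j / (size - 1) - 1.0 for j in range(size)]
    basis = [[zj ** k for k in range(degree + 1)] for zj in z]
    beta = [math.log(sum(counts) / size)] + [0.0] * degree

    def fitted(coefs):
        return [math.exp(sum(c * v for c, v in zip(coefs, row))) for row in basis]

    for _ in range(iterations):
        mu = fitted(beta)
        # Newton step: solve (X' diag(mu) X) step = X' (counts - mu).
        grad = [sum(basis[j][k] * (counts[j] - mu[j]) for j in range(size))
                for k in range(degree + 1)]
        info = [[sum(basis[j][r] * mu[j] * basis[j][s] for j in range(size))
                 for s in range(degree + 1)] for r in range(degree + 1)]
        beta = [c + d for c, d in zip(beta, solve(info, grad))]
    return fitted(beta)


# Toy frequencies for 11 score points; these are not study data.
observed = [1, 2, 4, 7, 10, 12, 10, 7, 4, 2, 1]
smoothed = loglinear_smooth(observed)
print([round(v, 1) for v in smoothed])
```

A useful property of this model is that the smoothed distribution keeps the same total count and the same first three moments as the observed one, so smoothing removes sampling noise without shifting the mean, spread, or skewness of the scores.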
Prediction error due to randomness was estimated from 1000 bootstrap samples from our data. If n is the number of candidates in a sample used to predict scores on “Exam B” (eg, USMLE Step 1) from scores on “Exam A” (eg, COMLEX-USA Level 1), then each bootstrap sample consists of n scores randomly selected with replacement from Exam A and n scores randomly selected with replacement from Exam B. With smoothing and percentile mapping as above performed for each bootstrap sample, the standard error of each concordance projection is the standard deviation of the projections for the 1000 bootstrap samples. The loglinear smoothing, equipercentile matching, and prediction-error calculations23 were performed using the R Equate package.24
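The bootstrap procedure described above can be sketched in Python as follows. For brevity the sketch omits the smoothing step and uses a simplified equipercentile mapping (mid-percentile ranks with linear interpolation), so it illustrates the resampling logic only. The function names, the fixed random seed, and the toy scores are our assumptions, not part of the study.

```python
import random
import statistics
from bisect import bisect_left, bisect_right


def percentile_rank(sorted_scores, x):
    """Mid-percentile rank of x in [0, 1]."""
    below = bisect_left(sorted_scores, x)
    at_or_below = bisect_right(sorted_scores, x)
    return (below + 0.5 * (at_or_below - below)) / len(sorted_scores)


def equipercentile(x, scores_a, scores_b):
    """Simplified mapping of x on Exam A to the equal-rank score on Exam B."""
    a, b = sorted(scores_a), sorted(scores_b)
    p = percentile_rank(a, x)
    ys = sorted(set(b))
    ranks = [percentile_rank(b, y) for y in ys]
    if p <= ranks[0]:
        return float(ys[0])
    if p >= ranks[-1]:
        return float(ys[-1])
    i = bisect_left(ranks, p)
    fraction = (p - ranks[i - 1]) / (ranks[i] - ranks[i - 1])
    return ys[i - 1] + fraction * (ys[i] - ys[i - 1])


def bootstrap_se(x, scores_a, scores_b, n_boot=1000, seed=0):
    """Standard error of the projection of x: the standard deviation of the
    projections across n_boot bootstrap samples."""
    rng = random.Random(seed)
    n = len(scores_a)
    projections = []
    for _ in range(n_boot):
        # n scores drawn with replacement from each examination, as in the text.
        boot_a = rng.choices(scores_a, k=n)
        boot_b = rng.choices(scores_b, k=n)
        projections.append(equipercentile(x, boot_a, boot_b))
    return statistics.stdev(projections)


# Toy data only; these are not study scores.
exam_a = list(range(400, 800, 10))
exam_b = list(range(200, 280, 2))
se = bootstrap_se(600, exam_a, exam_b, n_boot=200)
print(se > 0)  # True
```

With samples as small as this toy example the standard error is large; it shrinks as n grows, which is why large samples matter for a usable concordance table.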
All data gathered for this study remain secure, and all personal and/or school information remains confidential. The study was approved as human subjects research through expedited review by the NBOME Institutional Review Board (December 18, 2019).
We received data for 2301 students who took both COMLEX-USA Level 1 and USMLE Step 1 and for 1498 students who took both COMLEX-USA Level 2-CE and USMLE Step 2 CK over the respective periods from 2015 to 2020. With our requirement that the concorded examinations be taken no more than 150 days apart, our sample was reduced to 2115 students who took both COMLEX-USA Level 1 and USMLE Step 1. We eliminated one outlier from these 2115 records: a student with a COMLEX-USA Level 1 score of 264, where the next lowest score in the sample was 325. For the students who took both COMLEX-USA Level 2-CE and USMLE Step 2 CK, our sample was reduced to 1468 students. For students in our fully reduced samples, used to establish concordance, the average time between examinations was 12 days (SD=21) for COMLEX-USA Level 1 and USMLE Step 1, and 12 days (SD=18) for COMLEX-USA Level 2-CE and USMLE Step 2 CK. In the reduced sample of students who took both COMLEX-USA Level 1 and USMLE Step 1, 35% took COMLEX-USA first; in the sample for COMLEX-USA Level 2-CE and USMLE Step 2 CK, 48% took COMLEX-USA first.
Graphical comparison of the sample distributions and overall COMLEX-USA distributions of age when taking the examination, gender, and timing of the examination within the student's osteopathic medical studies showed only minor differences. However, the mean Level 1 score of the sample, 559.8 (SD=81.5), was 25.4 points greater than for the overall Level 1 population (P<.001; 95% CI 22.0-28.8; Cohen's d=0.31), and the mean Level 2-CE score for the sample, 599.9 (SD=92.2), was 44.0 points greater than for the Level 2-CE population (P<.001; 95% CI 39.4-48.7; Cohen's d=0.48).
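The reported effect sizes can be approximately reproduced by dividing each mean difference by the corresponding sample standard deviation. The text does not state which standard deviation was used as the standardizer, so using the sample SD here is our assumption; the sketch is a plausibility check rather than a reconstruction of the analysis.

```python
def standardized_difference(mean_difference, sd):
    """Mean difference expressed in standard deviation units."""
    return mean_difference / sd


# Values quoted in the text; the choice of standardizer is an assumption.
level1_d = standardized_difference(25.4, 81.5)
level2_d = standardized_difference(44.0, 92.2)
print(round(level1_d, 2), round(level2_d, 2))  # 0.31 0.48
```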
Figures 2 and 3 in the online supplementary data show the projections of USMLE scores from COMLEX-USA, plotted against the test scores in the predictive sample. The shaded area around the predictions shows the random error derived from our bootstrap samples.
Sample projections of ranges of COMLEX-USA scores onto ranges of USMLE scores are presented in the Table.
This study establishes concordance relationships between COMLEX-USA Level 1 and USMLE Step 1 and between COMLEX-USA Level 2-CE and USMLE Step 2 CK. The COMLEX-USA and USMLE series represent different examinations with different blueprints, test specifications, and formats; a concorded score is not a perfect predictor of how the student would perform on the other examination. However, the relationship is strong enough that concorded scores may reduce the need for DO students to take a second examination, and with it student debt and stress. More importantly, osteopathic students can focus on the competencies aligned to their school's curricular program and their profession's valid licensure examination program.
With the consolidation of the Accreditation Council for Graduate Medical Education and American Osteopathic Association residency programs under a single accreditation system, some residency program directors who use USMLE scores to evaluate applicants are not comfortable using COMLEX-USA. The NBOME has always advocated for holistic review of residency applicants and warned against the sole use or overuse of any examination score for screening or ranking of residency applicants. The NBOME also supports systemic reform for the current residency application process. However, as long as these examinations are used to evaluate applicants, a simple score conversion application would make this more efficient and effective. The NBOME is currently creating this tool, based on the study results presented here.
Although previous work has been done to establish the concordance between COMLEX-USA and USMLE scores, there have been ongoing blueprint and content changes to both examinations. Furthermore, these studies were based on relatively small student samples, applied linear models (while the score relationship is curvilinear), or did not consider estimation errors.14–19 Simple linear models and correlation coefficients may fail to capture the true relationship between scores, especially at the ends of the ability distribution. Nevertheless, within our study data, the correlation coefficients of 0.82 for COMLEX-USA Level 1 and USMLE Step 1 and 0.77 for Level 2-CE and Step 2 CK indicate that students scoring higher on COMLEX-USA have a strong tendency to score higher on USMLE, a relationship between the examinations that is necessary for application of equipercentile matching. Our equipercentile method has been widely used with other assessment programs25 and has been demonstrated as a valid method of modeling relationships between examinations when the constructs and blueprints are similar but not identical.26,27
With the announcements that the USMLE Step 1 and COMLEX-USA Level 1 examinations will report only pass/fail without numerical scores starting in May 2022, program directors will experience further limits to relying on these examinations to stratify their applicants. The Coalition for Physician Accountability has stressed that program directors should not over-rely on licensing examination scores for resident stratification and selection;28 reporting only pass/fail should decrease the pressure program directors put on DO applicants to take Step 1 in addition to Level 1. However, the USMLE Step 2 CK and COMLEX-USA Level 2-CE will still report numerical scores. Here, a concorded score will help program directors understand the relative performance of a DO student without expecting the applicant to take the additional examination.
Some students taking both examinations came from DO schools that mandated the taking of USMLE but not necessarily passing it; we cannot predict their level of motivation when they take USMLE. Such students may perform better on COMLEX-USA than on USMLE or vice versa, with the concordance relationship in the scores therefore attenuated. The fact that we limited the analyses to examinations taken in close proximity should counterbalance potential motivation and examination order effects.
Based on a large sample of osteopathic medical students who took both COMLEX-USA and USMLE examinations, there are strong concordance relationships between scores on these similarly measured constructs.
The authors would like to thank the advisory committee for COMLEX-USA Level 2-CE that reviewed and provided helpful suggestions for this research, and NBOME staff, specifically Dr. John Gimpel, Karen Huelsman, and Shirley Bodett, who provided comments on prior versions of this manuscript.