Abstract
The graduate medical education community uses results from the United States Medical Licensing Examination (USMLE) to inform decisions about individuals' readiness for postgraduate training.
We sought to determine the relationship between performance on the USMLE and the American Board of Anesthesiology (ABA) Part 1 Certification Examination using a national sample of examinees, and we considered the relationship in the context of undergraduate medical education location and examination content.
Approximately 7800 individuals met inclusion criteria. The relationships between USMLE scores and ABA Part 1 pass rates were examined, and predictions for the strength of the relationship between USMLE content areas and ABA performance were compared with observed relationships.
Pearson correlations between ABA Part 1 scores and USMLE Steps 1, 2 (clinical knowledge), and 3 scores for first-taker US/Canadian graduates were .59, .56, and .53, respectively. A clear relationship was demonstrated between USMLE scores and pass rates on ABA Part 1, and content experts were able to successfully predict the USMLE content categories that would least or most likely relate to ABA Part 1 scores.
The analysis provided evidence on a national scale that results from the USMLE and the ABA Part 1 were correlated and that success on the latter examination was associated with level of USMLE performance. Both testing programs have been successful in conceptualizing many of the knowledge areas of interest and in developing test content to reflect those areas.
The graduate medical education community uses results from the USMLE to inform decisions about individuals' readiness for postgraduate training.
There is a clear relationship between USMLE scores and pass rates on the ABA Part 1, and content experts were able to successfully predict the USMLE content categories that would least or most likely relate to ABA Part 1 scores.
Single-specialty study limits generalizability.
USMLE results, together with other important markers, can be useful and informative in the residency admission process for anesthesiology.
Introduction
All graduates of medical schools accredited by the Liaison Committee on Medical Education (LCME) who are seeking a medical license in the United States, as well as all international medical graduates (IMGs) seeking postgraduate training and licensure opportunities in this country, must take and pass the United States Medical Licensing Examination (USMLE). Although the primary purpose of the USMLE is to support the licensing decision, USMLE results are often used by the undergraduate education community to inform decisions about an individual's readiness for promotion and graduation, and they are often used by the graduate education community to inform decisions about the individual's readiness to move into postgraduate training. Related to the latter use, there has been recent interest in gaining a better understanding, on a national level, of the applicant characteristics that influence admission to graduate medical education positions in different specialties, including performance on the licensing examination.1,2
To date, few published studies have examined the relationship between licensing and certification examination results on a national level, but those that have show a moderately high, positive relationship between performance on the 2 types of examinations.3,4 A modest number of studies have also shown, on a local level, a positive relationship between licensing examinations and certification outcomes.5–8 In the field of anesthesiology, the focus of this study, there has been a national investigation of the relationship between in-training and certification examinations,9 but the relationship between the licensing examination and the American Board of Anesthesiology (ABA) Certification Examination has not been studied to date.
The purpose of this retrospective observational study was to determine the relationship between performance on the USMLE and the ABA Part 1 Certification Examination using a national, multiyear sample. Also analyzed was whether the relationships varied based on location of undergraduate education (United States/Canada or international), and the predictability of the relationship of performance on USMLE content areas to ABA Part 1 Certification Examination performance.
Methods
Examinations
ABA Certification Examination
The examination system for the ABA's primary certificate has 2 distinct parts: the Part 1 Examination and the Part 2 Examination. The ABA Part 1 Examination assesses the candidate's knowledge of basic and clinical sciences as applied to anesthesiology. This study used the results of the ABA Part 1 Examination over several consecutive years.
US Medical Licensing Examination
The USMLE has 3 components, or “Step” examinations. Overall results and subscores for all 3 examinations were used in the investigation, except for the Clinical Skills portion of Step 2, which was introduced to the USMLE relatively recently and had not yet been taken by a significant number of the Part 1 candidates. For Step 2, only the Clinical Knowledge (CK) component was used. Although the results of Step 3 are not typically considered for admission into residency programs, they were included in this analysis to broaden understanding of the relationship of constructs purportedly measured by the 2 assessment programs.
Participants were approximately 7800 individuals who (1) attempted the ABA Part 1 Examination for the first time from 2002 through 2007 and (2) completed their residency program between September of the year prior to each year's examination and August of the year of the examination. Of these individuals, 7008 were identified as having complete ABA and USMLE records and were included in the analysis. Of the approximately 800 individuals without a complete USMLE record, most were graduates of schools of osteopathic medicine. These individuals have an alternate licensing examination route available to them through the National Board of Osteopathic Medical Examiners; results from this program were not available for the investigation.
As part of the registration process for both programs, examinees are notified that deidentified, aggregated results from the examination may be used for research purposes, and for this reason IRB review was not sought.
Scores
Scores for the ABA Certification Examination are reported on a standardized scale based on the 1995 Base Reference Group. This process fixes the scale for all subsequent testing years, allowing comparisons and the ability to better recognize and understand trends. Reported scores for the Base Reference Group of the ABA Examination were scaled to have a mean of 250 and an SD of 50. A minimum score of 209 on this scale was required to pass the examination between 2002 and 2007.
Scores on the USMLE Step 1, Step 2 CK, and Step 3 examinations are reported on a 3-digit score scale, which, for reasons similar to those described above, was identified and fixed at the beginning of the USMLE program in the early 1990s. The scale for each examination was established to have a mean score of 200 and an SD of 20 for students and graduates of LCME-accredited US and Canadian medical schools who were taking each test for the first time in the initial administrations.
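To illustrate what fixing a scale to a reference group means, the following minimal sketch assumes a simple linear transformation: scores are standardized against the reference group's mean and SD and then rescaled to the target mean and SD. The operational ABA and USMLE scaling and equating procedures are not described in this article and are likely more involved; the function name and the raw scores below are hypothetical.

```python
from statistics import mean, pstdev


def make_scaler(reference_raw_scores, target_mean, target_sd):
    """Return a function that maps raw scores onto a scale fixed by a reference group.

    Assumes a plain linear (z-score) transformation, which is an
    illustrative simplification and not the documented ABA or USMLE procedure.
    """
    ref_mean = mean(reference_raw_scores)
    ref_sd = pstdev(reference_raw_scores)  # population SD of the reference group

    def scale(raw_score):
        return target_mean + target_sd * (raw_score - ref_mean) / ref_sd

    return scale


# Hypothetical raw scores for a tiny "reference group".
reference_group = [60, 70, 80, 90, 100]

# ABA-style scale from the text: mean 250, SD 50 in the reference group.
to_aba_scale = make_scaler(reference_group, target_mean=250, target_sd=50)
scaled_reference = [to_aba_scale(x) for x in reference_group]
```

Because the transformation is defined once from the reference group and then reused, examinees in later years are placed on the same scale, which is what allows year-to-year comparisons.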
Analysis
After identification of all ABA and USMLE records, total test score means and SDs were calculated for the first-time attempt of the study group (N = 7008) on the ABA Part 1 Examination and each of the USMLE examinations. This information was further broken down by location of undergraduate medical education. Pearson correlation coefficients were calculated between the ABA Part 1 score and each of the 3 USMLE Step Examination scores, as well as among the USMLE measures themselves. This calculation was done for the overall study population, and also for the subset of examinees who were US/Canadian medical school graduates (USMGs). Correlations for the USMG group were also corrected for the unreliability of scores, yielding a true correlation. The true correlation is an estimate of what the observed correlation would be if the measurements were perfectly reliable, and it is intended to facilitate a better understanding of the relationship between the underlying constructs assessed by the examinations of interest.10
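The correction for unreliability can be sketched as follows, assuming the standard Spearman correction for attenuation: the observed correlation is divided by the square root of the product of the 2 reliability coefficients. The article does not spell out the formula, so this is an assumption, and the reliability values used in the example are hypothetical placeholders, not the coefficients reported in Table 2.

```python
from math import sqrt


def correct_for_attenuation(observed_r, reliability_x, reliability_y):
    """Estimate the true-score correlation from an observed correlation.

    Uses Spearman's correction for attenuation:
        r_true = r_observed / sqrt(reliability_x * reliability_y)
    """
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliability coefficients must be in (0, 1]")
    return observed_r / sqrt(reliability_x * reliability_y)


# Observed ABA Part 1 / Step 1 correlation for USMGs, as given in the abstract.
observed = 0.59

# Hypothetical reliabilities, chosen only to illustrate the calculation.
true_r = correct_for_attenuation(observed, reliability_x=0.90, reliability_y=0.90)
```

Since reliability coefficients cannot exceed 1, the corrected value is never smaller in magnitude than the observed correlation; with perfectly reliable scores the 2 values coincide.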
To investigate the relationship between USMLE performance and success on the ABA Certification program, initial pass rates on the ABA Part 1 Examination were reviewed as a function of total test performance on each of the USMLE Step examinations. To assess whether the relationship between USMLE and ABA Part 1 Examination performance is meaningful from a content perspective, ABA and USMLE content experts (chairs of 17 USMLE test material development committees and 4 members of the ABA Research Committee) were asked to predict the relationship between USMLE performance on the major content areas represented in USMLE and overall ABA Part 1 performance for all examinees.
The experts were provided with a description of the content of the examination used by the other organization and asked to predict how well they thought each of the USMLE content areas would relate to ABA Part 1 Examination performance. A 9-point scale was used, ranging from “unrelated” to “highly related.” Pearson correlation coefficients were calculated between each of the USMLE content areas and the total ABA Part 1 Examination score. Because subscores on the USMLE content areas are not statistically equated from year to year and are therefore not reported on a scale that easily allows year-to-year comparisons, correlations were calculated within each of the USMLE examination years for which there were 50 or more participant examinees, and the median correlation across examination years was calculated. The correlations were corrected for unreliability, in the manner described previously, yielding true correlations.
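The within-year approach described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the record layout, function names, and toy data are hypothetical, and the correction for unreliability is omitted here.

```python
from math import sqrt
from statistics import median


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def median_within_year_correlation(records, min_n=50):
    """Median of per-year correlations between a subscore and the ABA score.

    records: iterable of (usmle_exam_year, subscore, aba_score) tuples.
    Years with fewer than min_n examinees are skipped, mirroring the
    50-examinee threshold described in the text.
    """
    by_year = {}
    for year, subscore, aba_score in records:
        by_year.setdefault(year, []).append((subscore, aba_score))

    yearly_r = []
    for pairs in by_year.values():
        if len(pairs) >= min_n:
            subscores, aba_scores = zip(*pairs)
            yearly_r.append(pearson(subscores, aba_scores))
    return median(yearly_r)


# Toy data: three years with 50 examinees each, one year with only 10.
records = (
    [(2001, i, 2 * i) for i in range(50)]    # r = +1
    + [(2002, i, -i) for i in range(50)]     # r = -1
    + [(2003, i, i + 5) for i in range(50)]  # r = +1
    + [(2004, i, -i) for i in range(10)]     # too few examinees, skipped
)
result = median_within_year_correlation(records)
```

Computing the correlation separately within each year avoids pooling subscores that are not on a common scale; the median then summarizes the yearly values without being dominated by a single atypical year.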
Results
Table 1 provides counts, means, and SDs overall and by location of undergraduate training for the ABA Part 1 and for the USMLE Steps 1, 2 CK, and 3. From 2002 through 2007, with some minor variations, the number of examinees taking the ABA Part 1 generally increased, ABA mean scores generally increased, and pass rates fluctuated in the range of 81% to 88%. Mean USMLE scores also increased during this period, paralleling a national trend in increasing scores for the program. The percentage of ABA Part 1 examinees who were USMGs increased from 59% to 83%, with a corresponding decrease in IMGs. Figure 1 shows the change in mean performance on the ABA Part 1 during that same period for USMGs and IMGs separately.
Figure 1 American Board of Anesthesiology (ABA) Part 1 Mean Scores, by ABA Exam Year, for Examinees Grouped by Location of Undergraduate Medical Education (United States/Canadian Medical Graduates or International Medical Graduates)
Table 1 American Board of Anesthesiology (ABA) Part 1 and United States Medical Licensing Examination (USMLE) Step Examination Counts, Mean Scores, and SDs for Part 1 Examination Years 2002–2007, by Location of Undergraduate Medical Education

Table 2 provides observed (Pearson) correlation coefficients among the 4 total test scores for the total group of examinees. It also provides observed correlations for USMGs only, along with true correlations statistically corrected for the unreliability of the scores using reliability coefficients based on the performance of first-time USMG takers. The groups used for the USMLE reliability coefficient calculations were further restricted to those who graduated from LCME-accredited medical schools.
Table 2 For American Board of Anesthesiology (ABA) Part 1 and United States Medical Licensing Examination (USMLE) Examinations, Observed Correlations for All Examinees in the Study Group; Reliability Coefficients, Observed Correlations, and True Correlations (in Parentheses) for United States Medical Graduates (USMGs)

For the total group and for USMGs only, the observed correlations between the ABA Part 1 and USMLE were all moderately high and positive. The true correlations between the ABA Part 1 and USMLE were .64 in all instances. All correlations between ABA Part 1 and USMLE were lower than those found among the various Steps of the USMLE.
Figure 2 shows, for the overall examinee group, the relationship between the first-attempt passing rate on the ABA Part 1 and the first-attempt total test performance on the USMLE. USMLE Step scores are represented in 10-point intervals. Results indicate that across all 3 Step examinations, higher USMLE performance was strongly related to higher passing rates on the ABA Part 1 Examination.
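The grouping of Step scores into 10-point intervals can be sketched as follows. This is a minimal illustration, assuming intervals aligned to multiples of 10 (the article does not state how interval boundaries were chosen); the function name and the example scores are hypothetical, while the passing score of 209 comes from the Scores section.

```python
from collections import defaultdict

PASSING_SCORE = 209  # minimum ABA Part 1 passing score, 2002-2007 (from the text)


def pass_rate_by_interval(examinees, width=10):
    """Return {interval lower bound: ABA pass rate} for USMLE score intervals.

    examinees: iterable of (usmle_score, aba_score) pairs with integer scores.
    Interval alignment to multiples of `width` is an assumption.
    """
    counts = defaultdict(lambda: [0, 0])  # lower bound -> [passed, total]
    for usmle_score, aba_score in examinees:
        lower = (usmle_score // width) * width
        counts[lower][1] += 1
        if aba_score >= PASSING_SCORE:
            counts[lower][0] += 1
    return {lower: passed / total for lower, (passed, total) in sorted(counts.items())}


# Hypothetical (USMLE Step score, ABA Part 1 score) pairs.
examinees = [(185, 200), (188, 215), (201, 230), (209, 208), (215, 260), (219, 240)]
rates = pass_rate_by_interval(examinees)
```

With real data, intervals containing very few examinees would yield unstable pass rates, so sparse intervals at the extremes of the score range are often merged or interpreted with caution.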
Figure 2 Relationship Between American Board of Anesthesiology Part 1 Examination Pass Rates and United States Medical Licensing Examination Step Examination Scores
Figure 3 displays, for USMGs and IMGs separately, the relationship between ABA Part 1 pass rates and performance on all USMLE examinations. A strong relationship between Step examination performance and ABA pass rates persists, but with some variation across examinee subgroups, especially for those with relatively low USMLE scores.
Figure 3 Relationship Between American Board of Anesthesiology Part 1 Examination Pass Rates and United States Medical Licensing Examination Scores for Examinees Grouped by Location of Undergraduate Medical Education
Table 3 lists the USMLE content categories (related to medical disciplines, organ systems, or diseases) that content experts predicted would have the 10 strongest and the 10 weakest relationships with ABA Part 1. Also provided are the calculated true correlation and an indication of whether the true correlation was in the top, middle, or bottom third of all true correlations calculated. For most of the categories listed, there was high agreement between expert prediction and the magnitude of the true correlations. Many of these relationships make sense conceptually, given the focus of the ABA Part 1. The only notable exceptions were for the Step 1 content area of microbiology (low predicted relationship, high actual relationship) and the Step 3 content area of circulatory/blood (high predicted relationship, low actual relationship).
Table 3 True Correlations Between US Medical Licensing Examination (USMLE) Content Categories and American Board of Anesthesiology (ABA) Part 1 for Categories Predicted to Have the Highest (n = 10) and Lowest (n = 10) Relationships, as Well as an Indication of Whether the True Correlation Was in the Highest, Middle, or Lowest Third of All True Correlations Calculated

Discussion
In considering the relationship between outcomes from the 2 assessment programs, overall results of this analysis suggest a moderately high, positive relationship between scores obtained on each of the USMLE Step examinations and performance on the ABA Part 1 Certification Examination. In addition, the level of performance on the USMLE Step examinations showed a clear relationship to success on ABA Part 1. The reader should recognize, however, that the results reflect average values during this period, and the relationship between USMLE performance and ABA Part 1 performance may vary by individual.
Consistent with the studies cited previously,3,4 other investigators have found significant relationships between sequenced multiple-choice examinations. Such correlations have been reported between admissions tests and written board examinations,11 between the sequenced components of the USMLE,12 between undergraduate medical specialty “shelf” examinations and USMLE Steps,13,14 and between USMLE Step examinations and residency specialty in-training or written examinations.15–17
It is difficult to identify with certainty the reasons for the strength and the patterns of relationships detected in this analysis. It is not unreasonable to attribute some of these results to the often-shared observation that well-educated and highly motivated students and graduates do well on standardized examinations, no matter what the content. Although this might explain in part the relationship between the 2 assessment programs, it is equally reasonable to suggest that those individuals who acquire a strong foundation in the basic sciences (as assessed in the Step 1 program) and who are able to apply that foundation in a safe and effective manner in a variety of clinical situations (as assessed by the Step 2 and Step 3 programs) are likely to be successful in their anesthesiology training and experience (as reflected in the ABA Part 1 outcomes).
The final component of the overall investigation involved a comparison of examination performance with content expert expectations for the strength of relationships between ABA Part 1 Examination performance and the various content areas covered by the USMLE Step examinations. As reported for categories that received either relatively high or relatively low predictions, the data generally support the conclusion that a panel of experts succeeded in choosing the USMLE subject matter most and least relevant to performance on a subsequent certifying anesthesiology examination. Although there is no “target value” for a particularly meaningful association, the overall pattern of agreement suggests that the test developers were successful in developing test content that reflected the intended knowledge areas.
This outcome provides support for the validity of both testing programs. The fact that the content relationships that were most clearly expected were also so frequently supported by the data suggests that both testing programs have enjoyed some success in conceptualizing the knowledge and skills of interest, in organizing test designs to appropriately reflect those concepts, and in developing test content to reflect the design.
The study has several limitations. First, the investigation included only a single specialty-board examination and a limited number of cohorts within that specialty. Although the results are consistent with those from studies in other specialties, replications in additional specialties are clearly desirable. The second limitation relates to the time period studied. Although use of 6 national cohorts is a major strength of the study, the most recent cohort included took Part 1 in 2007. Replication with more recent Part 1 cohorts is desirable. Although large changes in the overall relationship between the measures would not be anticipated with the more recent cohorts, there may be subtler shifts in patterns related to such things as changes in perception about the desirability of the various disciplines, limits on resident duty hours, or the continued development of competency-based approaches to education and assessment. Third, the study did not investigate program-to-program variation in the nature of the relationship between Step and Part 1 scores. Differences in the selectivity of programs, the resulting impact on the educational environment, and variation in other program characteristics may affect the nature and the strength of the relationship between Step scores and Part 1 performance.
It is important that those who use USMLE data to inform decisions about residency program admissions understand that the inferences to be made about the successful USMLE examinee are best guided by an understanding of the purpose and focus of the USMLE tests and by the value that the residency program places on the individual's demonstration of a solid foundation of basic and clinical science knowledge. The results of this investigation generally support the conclusion that USMLE results, used with other important markers, can be useful and informative in the residency admission process.
References
Author notes
Gerard F. Dillon, PhD, is Vice President for United States Medical Licensing Examination at the National Board of Medical Examiners; David B. Swanson, PhD, is Vice President for Program Development and Special Projects at the National Board of Medical Examiners; Joseph C. McClintock, PhD, is Senior Program Manager with Measurement Inc; and Glenn P. Gravlee, MD, is Professor of Anesthesiology at the University of Colorado, and a former Director of the American Board of Anesthesiology.
Funding: All work was funded by the American Board of Anesthesiology and the National Board of Medical Examiners.