Abstract
Many medical certifying bodies require that a minimum number of clinical procedures be completed during residency training to obtain board eligibility. However, little is known about the relationship between the number of procedures residents perform and their clinical competence.
This study evaluated associations between residents' procedural skills, measured in a simulation laboratory, and their self-reported procedure experience and year of training.
This research synthesis extracted and summarized data from multiple cohorts of internal medicine, emergency medicine, anesthesiology, and neurology resident physicians who performed simulated clinical procedures. The procedures were central venous catheter insertion, lumbar puncture, paracentesis, and thoracentesis. We compared residents' baseline simulated performance to their self-reported procedure experience using data from 7 research reports written by Northwestern University investigators between 2006 and 2016. We also evaluated how performance differed by postgraduate year (PGY).
A total of 588 simulated procedures were performed during the study period. We found significant associations between passing the skills examinations and both a higher number of self-reported procedures performed (P = .011) and higher PGY (P < .001). However, overall performance was poor: the passing standard was met in only 10% of baseline assessments, with a mean of 48% of checklist items correct (SD = 24.2). The association between passing the skills examination and year of training was mostly due to differences between PGY-1 and subsequent years of training.
Despite positive associations between self-reported experience and simulated procedure performance, overall performance was poor. Residents' clinical experience is not a proxy for skill.
What was known and gap
Procedural experience requirements for residents are intended to ensure competence in graduates, yet few studies have assessed their validity.
What is new
A study of simulated procedure performance in 4 specialties analyzed the impact of prior experience and year of training on competence.
Limitations
Single-institution study; recall bias possible in self-reported procedure experience.
Bottom line
Despite overall poor simulated procedure performance, experience and year of training were positively associated with performance.
Introduction
Many medical accrediting and certifying bodies require that a minimum number of clinical procedures be completed before graduation from residency training or to obtain board eligibility,1,2 and some studies link clinical experience (often expressed as the number of procedures performed) with reduced complications.3–6 However, the applicability of this research to trainee certification is uncertain because the number of procedures needed to reduce complications in these studies is well beyond what might be achieved in normal residency training.3,6,7 Additionally, several recent studies questioned whether clinical experience can serve as a proxy for skill. One systematic review8 dispelled the common notion that physicians with more experience have better clinical skills by showing an inverse relationship between years in practice and quality of care. Another study9 used data from the American Board of Surgery to show that the number of procedures performed by surgical residents was much lower than what would be considered necessary to achieve competence. Within internal medicine, skill acquisition studies evaluating the relationship between the number of self-reported procedures performed during residency and procedure skills similarly failed to show significant correlations.10–15
These studies call into question the common practice of treating the number of procedures performed as a surrogate for trainee clinical competence.1 The Accreditation Council for Graduate Medical Education (ACGME) recently changed the expectations of residency and fellowship programs to require use of standardized milestones.16 The Milestone Project represents progress toward ensuring that training programs graduate physicians who are competent to perform the tasks they will be expected to execute in practice. This change is welcome given evidence that residents and fellows are often not competent to perform patient care tasks before graduation,9,10,15,17–19 which translates into uneven performance in practice.6,20,21
One method to ensure that trainees are adequately prepared before performing procedures on patients is the use of simulation-based education. Simulation-based education can be used in a mastery model,22 in which participants are required to meet or exceed a minimum passing score (MPS) before the completion of training. In simulation-based mastery learning, time varies, while learning outcomes are uniform. This educational strategy ensures that all clinicians working with patients are competent despite variation in the number of procedures performed in the past.10–15
We performed a research synthesis to evaluate simulation-based mastery learning as a best practice for clinical skills assessment and certification of medical trainees in lieu of relying on clinical experience. We hypothesized that clinical experience (number of procedures performed during residency training or years in practice) is not meaningfully associated with the ability to meet or exceed the MPS for a clinical procedure in a controlled setting. The current study therefore had 2 aims: to assess the competence of a large number of medical residents performing a diverse set of clinical procedures, and to evaluate associations between measured procedural skill and both self-reported procedure experience and year of training.
Methods
We performed a research synthesis of 7 studies conducted by Northwestern University investigators, which gave us access to all primary data.11–15,17,23 The Northwestern University Institutional Review Board approved all 7 studies. Table 1 summarizes the 7 studies.
These data included simulated procedures from multiple cohorts of internal medicine, emergency medicine, anesthesiology, and neurology resident physicians who performed central venous catheter (CVC) insertion, lumbar puncture (LP), paracentesis, and thoracentesis procedures.11–15,17,23 Resident physicians from 4 academic tertiary medical centers and 1 academic community hospital in Chicago performed these procedures. We compared objective evaluations of residents' baseline simulated performance (before any educational intervention), measured by their ability to achieve the MPS (a competency standard), with self-reported experience (number of procedures performed) and postgraduate year (PGY) of training. The baseline simulated performance reflects “traditional” medical education and informal learning that may have occurred during medical school or residency, before any simulation-based educational intervention.
The 7 published studies provided reliable data from checklist measures of CVC insertion at the internal jugular and subclavian vein sites,12–14 LP,17 paracentesis,11 and thoracentesis skills.15,23 Checklist items were scored dichotomously as steps done correctly versus not done/done incorrectly, and items had high interrater reliability.11–15,17,23 Residents provided demographic and clinical experience information, including age, sex, specialty, year of training, and experience (self-reported number of the specific procedures performed). To comply with accreditation requirements, all residents were required by their programs to keep procedure logs and were able to use these logs to help report the number of procedures performed. Data were collected using the same evaluation tools throughout the study period, which enabled us to combine data from the 7 studies.
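To make the scoring concrete, the following is a minimal sketch, with hypothetical item data rather than the studies' actual checklists, of how a dichotomous checklist score and a simple interrater reliability statistic such as Cohen's kappa can be computed:

```python
# Sketch of dichotomous checklist scoring and an interrater reliability
# check (Cohen's kappa). The item data below are hypothetical and do not
# come from the 7 published studies.
from sklearn.metrics import cohen_kappa_score

# Two raters score the same resident on a 10-item checklist:
# 1 = step done correctly, 0 = not done / done incorrectly.
rater_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
rater_b = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1]

# Checklist score = percentage of items done correctly.
percent_correct = 100 * sum(rater_a) / len(rater_a)
print(f"Rater A checklist score: {percent_correct:.0f}% of items correct")
print(f"Cohen's kappa between raters: {cohen_kappa_score(rater_a, rater_b):.2f}")
```

The published studies report their own reliability statistics; this sketch only illustrates the form of the calculation.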
As part of training in each of the studies, residents were required to participate in simulation-based mastery learning curricula for procedural skill acquisition, which have been described in detail elsewhere.11–15,17,23 In brief, residents underwent a baseline assessment on a procedure simulator. They then watched a video and lecture and engaged in deliberate practice on the simulator with directed feedback from an expert instructor. Residents were then required to meet or exceed the MPS at posttest before completing training. Participants who did not meet the MPS continued deliberate practice until they met or exceeded this score. Educational outcomes were thus uniform among trainees, while training time varied.
The MPS was set in all 7 studies by an expert panel of judges who used the Angoff and Hofstee standard-setting methods to derive a passing score judged safe for patient care.11–15,17,23 The MPSs for CVC insertion and thoracentesis were subsequently revised in 2010 and 2014, respectively, based on reassessment of resident skills by an expert panel.23,24 For the purposes of this study, the original MPS was used for each procedure.
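As an illustration only (the judge ratings below are hypothetical, not those of the studies' panels), an Angoff-style cut score is obtained by averaging judges' estimates of how a minimally competent resident would perform on each checklist item:

```python
# Minimal sketch of an Angoff-style standard-setting calculation.
# The ratings are hypothetical, not the studies' actual panel data.
# Each judge estimates the probability that a minimally competent
# resident performs each checklist item correctly.
judge_ratings = [
    # item1 item2 item3 item4 item5
    [0.90, 0.70, 0.60, 0.80, 0.50],  # judge A
    [0.85, 0.75, 0.55, 0.90, 0.60],  # judge B
    [0.95, 0.65, 0.50, 0.85, 0.55],  # judge C
]
n_items = len(judge_ratings[0])

# Each judge's implied cut score = expected number of items a
# borderline (minimally competent) examinee would get right.
judge_cut_scores = [sum(row) for row in judge_ratings]

# The Angoff MPS is the mean of the judges' cut scores, expressed
# here as a percentage of checklist items correct.
mps_items = sum(judge_cut_scores) / len(judge_cut_scores)
print(f"Angoff MPS: {mps_items:.2f} of {n_items} items "
      f"({100 * mps_items / n_items:.0f}%)")
```

The Hofstee method additionally bounds the cut score using judges' acceptable ranges for the passing score and failure rate; the cited reports describe how the panels combined the two approaches.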
Our unit of analysis was baseline simulated procedures, not residents, because a resident may have performed more than 1 procedure type during training; however, each resident performed each procedure type only once. Continuous variables, such as age and number of self-reported procedures performed (experience), were converted to categorical variables based on their frequency distributions. We created a “dummy” variable for missing data on the number of procedures performed (when a resident did not report it). We used the χ2 test to evaluate relationships between the percentage of procedures in which residents met or exceeded the MPS at baseline (pretest) and both the number of procedures performed and PGY. We estimated logistic regression models of the association between the number of procedures performed and the likelihood of passing, using the 0 procedures (completely inexperienced) category as the reference. We also evaluated the effect of year of training on the likelihood of passing, using PGY-1 as the reference category. Finally, we estimated a logistic regression of the likelihood of passing that simultaneously included the number of procedures performed, year of training, resident age, sex, medical specialty, and procedure type as independent variables. Although each simulated procedure was treated as an independent event, we performed a sensitivity analysis using random-effects logistic regression to adjust standard errors for multiple simulated procedures performed by the same resident.
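For illustration only, the following is a sketch of the main analyses in Python using simulated data and hypothetical category labels; the actual analyses were run in the statistical packages noted below:

```python
# Sketch of the chi-square and logistic regression analyses described
# above. The data are simulated and the experience/PGY category labels
# are hypothetical; they do not reproduce the study's results.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)
n = 400

# One row per baseline simulated procedure (the unit of analysis).
exp_cat = rng.choice(["0", "1-2", "3-6", "7-10"], size=n)
pgy = rng.choice(["1", "2", "3", "4"], size=n)
# Pass probability rises with experience but stays low overall.
base = {"0": 0.04, "1-2": 0.08, "3-6": 0.12, "7-10": 0.20}
passed = rng.binomial(1, [base[e] for e in exp_cat])
df = pd.DataFrame({"passed": passed, "exp_cat": exp_cat, "pgy": pgy})

# Chi-square test of passing rate by experience category.
chi2, p, dof, _ = chi2_contingency(pd.crosstab(df["exp_cat"], df["passed"]))
print(f"chi-square = {chi2:.2f}, df = {dof}, P = {p:.3f}")

# Logistic regression with "0 procedures" and PGY-1 as reference groups.
model = smf.logit(
    "passed ~ C(exp_cat, Treatment('0')) + C(pgy, Treatment('1'))",
    data=df,
).fit(disp=False)
print(np.exp(model.params))  # exponentiated coefficients = odds ratios
```

The random-effects sensitivity analysis could be fit analogously with a mixed-effects logistic model (eg, Stata's melogit) using resident identity as the grouping variable.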
We performed all statistical analyses using IBM SPSS Statistics version 22 (IBM Corp, Armonk, NY) and Stata version 14 (StataCorp LP, College Station, TX).
Results
A total of 588 baseline measurements of simulated procedures were performed by 382 unique residents during the study period from 2006 to 2016. A total of 143 residents performed both CVC internal jugular and subclavian vein assessments; 38 performed only LP; 49 performed LP and paracentesis; 145 performed only thoracentesis; and 7 performed LP, paracentesis, and thoracentesis. For 12 of the 588 procedures (2%), residents did not report their procedural experience at the time of the simulated assessment. Resident demographic and clinical data for each procedure are shown in Table 2. Residents met or exceeded the MPS at baseline assessment for only 59 of 588 procedures (10%). The Figure shows the percentage of residents achieving the MPS and passing the baseline (pretest) for each procedure, by number of procedures performed. Overall, baseline performance across all procedures combined was poor, with a mean of 48% of checklist items correct (SD = 24.2).
Figure abbreviations: CVC, central venous catheter; IJ, internal jugular; SC, subclavian; LP, lumbar puncture.
The χ2 test revealed significant associations between meeting the MPS at baseline and both procedure experience (P = .011) and year of training (P < .001; Table 3). The odds ratios associating simulated procedure performance with number of procedures performed and year of training are also shown in Table 3. The association between meeting the MPS and year of training was driven mostly by the difference between PGY-1 residents and those in PGY-2, PGY-3, and PGY-4 (Table 3), rather than by differences among PGY-2, PGY-3, and PGY-4.
After controlling for independent variables, the association between meeting the MPS and greater experience remained significant. The association was strongest in the 7 to 10 procedures category, in which residents were up to 7.5 times more likely to pass the baseline assessment than those with no experience (P = .001). Yet the mean overall skills performance (across all procedures) among residents in this experience category was only 56% of checklist items correct (SD = 24.6), and only 20% of these procedures were performed competently as measured by meeting or exceeding the MPS. The association between meeting the MPS and year of training was no longer present after controlling for covariates in the regression model. There were no significant associations between meeting the MPS and age, sex, type of procedure, or clinical specialty. Random-effects modeling that adjusted for clustering of procedures by resident produced virtually identical results (data not shown).
Discussion
This research synthesis shows that both procedure experience and year of training were positively associated with baseline competence across multiple clinical procedures. Previous individual studies failed to show significant associations between procedure experience and performance;10–15 combining data from multiple studies increased the power to detect them. Despite these associations, only 10% of assessed procedures demonstrated competence at baseline, with low passing rates even among the most experienced and senior trainees. For example, residents who had performed 7 to 10 procedures on actual patients were the most likely to meet the MPS on the simulated baseline assessment, yet their overall performance was still poor, with a mean of 56% of checklist items correct (SD = 24.6), and the majority of residents with this level of experience (80%, 44 of 55) failed to reach the MPS at baseline. Based on these findings, we believe that procedural experience should not be used as a surrogate for competence in medical trainees.
Few studies have linked an experience threshold to procedure performance outcomes. For instance, 1 study demonstrated that, after an educational intervention, residents inserting CVCs increased their success rate to approximately 90% after 10 ultrasound-guided catheter insertions (compared with approximately 80% after 1 to 3 insertions), and the complication rate plateaued at approximately 8% after only 4 catheter insertions (compared with approximately 13% after 1 to 3 insertions).25 Another study of CVC insertion suggested that up to 50 procedures were needed to reduce complications from subclavian line insertions,3 while a study of colonoscopy showed that up to 80 procedures were necessary for improved patient care.7 Annual bariatric surgery volumes of over 150 cases were also associated with better performance and fewer complications compared with lower-volume practitioners.6 These procedural numbers are well above the minimum requirements for trainees, and they raise questions about the use of procedure tracking without parallel, rigorous, competency-based assessments. For example, the American Board of Internal Medicine specifies that “to assure adequate knowledge and understanding of the common procedures in internal medicine, each resident should be an active participant” in CVC procedures at least 5 times,26 while the American College of Surgeons recognizes CVC insertion as an essential skill yet does not formally recommend a specific number needed to achieve competency.27
Assumptions of competence related to PGY are also problematic. Our findings show that PGY-1 residents were less likely to meet the MPS than residents at PGY-2 and above, but there were no observable differences between PGY-2 and PGY-3 passing rates, and skills appeared to decline slightly in the PGY-4 group. These findings were likely due to differences in the number of procedures performed across PGYs, because the regression analysis failed to confirm the association with year of training after controlling for covariates (including the number of procedures performed).
The ACGME has recognized the limitations of linking procedure numbers or time in training to competency.16 Rigorous simulation-based education is a natural fit with the ACGME milestone framework because it provides standardization, deliberate practice, feedback, translation of outcomes to improved patient care, and reliable formative evaluation until a mastery standard is met. For instance, gastroenterology fellows who practiced on a colonoscopy simulator rather than patients performed as well during actual patient colonoscopies as fellows who had already performed 80 procedures on actual patients.7 Simulation-based mastery learning has been used to improve skills in diverse clinical areas, including end-of-life discussions,28 cardiac auscultation,29 management of pediatric status epilepticus,30 advanced cardiac life support,18 and laparoscopic surgery,31 and has also been shown to reduce patient complications,12,14,32–34 decrease length of hospital stay,33 and reduce hospital costs.35,36
This study has several limitations. It was performed at a limited number of centers in 1 city, and findings may not generalize to other locations. Because local experts determined the checklists and minimum standards, it is possible that trainees new to our system performed poorly because they previously trained in a different environment. Finally, prior experience was self-reported and may be subject to recall bias, although residents were required to keep procedure logs and were able to review them while answering the procedural experience questions.
Conclusion
Our study revealed that experience and year of training were positively associated with procedure performance. However, overall performance was still poor even in the most experienced residents.
References
Author notes
Funding: Central venous catheter insertion simulation was supported by the Excellence in Academic Medicine Act administered by Northwestern Memorial Hospital for the Illinois Department of Healthcare and Family Services. The contributions of Drs Barsuk, Feinglass, and Wayne to this project were also partially supported by grant R18HS021202-01 from the Agency for Healthcare Research and Quality.
Competing Interests
Conflict of interest: The authors declare they have no competing interests.
The authors would like to thank Drs Douglas Vaughan and Kevin O'Leary for their support and encouragement of this work. The authors would also like to thank the resident physicians for their dedication to medical education and patient care.